Cancer risk estimating method using sequence polymorphisms in a specific
region of chromosome 19

The present invention provides methods and compositions for identifying human
subjects with an increased risk of having or developing cancer. In particular, this
invention relates to the identification and characterization of polymorphisms in
human chromosome 19q, in the region r located approximately at 19q13.2-3, that are
correlated with an increased risk of developing cancer and with the responsiveness
of a subject to various treatments for cancer.



Background 



DNA polymorphisms provide an efficient way to study the association of genes and
diseases by analysis of linkage and linkage disequilibrium. With the sequencing of
the human genome, a myriad of hitherto unknown genetic polymorphisms among
people have been detected. Most common among these are the single nucleotide
polymorphisms, also called SNPs, of which several million are now known. Other
examples are variable number of tandem repeat polymorphisms, insertions,
deletions and block modifications. Tandem repeats often have multiple different alleles
(variants), whereas the other groups of polymorphisms usually have just two alleles.
Some of these genetic polymorphisms probably play a direct role in the biology of
the individuals, including their risk of developing disease, but the virtue of the
majority is that they can serve as markers for the surrounding DNA, and thus serve as
leads during a search for a causative gene polymorphism, as substitutes in the
evaluation of its role in health and disease, and as substitutes in the evaluation of
the genetic constitution of individuals.

The association of an allele of one sequence polymorphism with particular alleles of
other sequence polymorphisms in the surrounding DNA has two origins, known in
the genetic field as linkage and linkage disequilibrium, respectively. Linkage arises
because large parts of chromosomes are passed unchanged from parents to
offspring, so that minor regions of a chromosome tend to flow unchanged from one
generation to the next and also to be similar in different branches of the same
family. Linkage is gradually eroded by recombination occurring in the cells of the
germline, but typically operates over multiple generations and distances of a number of
million bases in the DNA.

Linkage disequilibrium deals with whole populations and has its origin in the (distant)
forefather in whose DNA a new sequence polymorphism arose. The immediate
surroundings in the DNA of the forefather will tend to stay with the new allele for many
generations. Recombination and changes in the composition of the population will
again erode the association, but the new allele and the alleles of any other
polymorphism nearby will often be partly associated among unrelated humans even today. A
crude estimate suggests that alleles of sequence polymorphisms with distances less
than 10000 bases in the DNA will have tended to stay together since modern man
arose. Linkage disequilibrium in limited populations, for instance Europeans, often
extends over longer distances. This can be the result of newer mutations, but can
also be a consequence of one or more "bottlenecks" with small population sizes and
considerable inbreeding in the history of the current population. Two obvious
possibilities for "bottlenecks" in Europeans are the exodus from Africa and the
repopulation of Europe after the last ice age.

Linkage disequilibrium is the result of many stochastic events and as such is subject
to statistical variation, occasionally resulting in discontinuities, lack of a monotonic
relationship between association and distance, and differences between people of
different ethnicity. Therefore, it is often advantageous to study more than one
sequence polymorphism in a given region. This also allows for further definition of the
genetic surroundings of the biologically relevant polymorphism by combining the
associated alleles of the different markers into a so-called haplotype.
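
The strength of association between the alleles of two nearby markers is commonly
summarised by the standard measures D, D' and r^2. The following is a minimal
sketch of that calculation for two biallelic markers; the function name and the
example frequencies are illustrative assumptions and are not data from this
application.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic markers.

    p_ab : frequency of haplotypes carrying allele A at marker 1
           together with allele B at marker 2
    p_a  : frequency of allele A at marker 1 (strictly between 0 and 1)
    p_b  : frequency of allele B at marker 2 (strictly between 0 and 1)
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both markers must be polymorphic")
    # D is the excess of the observed haplotype frequency over the
    # frequency expected if the two alleles were independent.
    d = p_ab - p_a * p_b
    # D' rescales D by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max
    # r^2 is the squared correlation between the two allelic states.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


if __name__ == "__main__":
    # Hypothetical frequencies, for illustration only.
    print(ld_measures(p_ab=0.30, p_a=0.40, p_b=0.35))
```

Values of D' and r^2 near 1 indicate that one marker is a good substitute for the
other, and values near 0 indicate that the markers carry largely independent
information.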

Humans in general carry two copies of each human chromosome in each cell. There
are exceptions to this rule, not relevant to this application. We therefore speak about
genotypes, i.e. the combined analysis of both chromosomes at a given sequence
polymorphism. The resulting genotypes of a person, analysed for instance on DNA
from peripheral blood leukocytes, are inherently very stable over time. Therefore,
this type of analysis can be performed any time in the life of a person and will be
applicable to this person for his or her entire life. By the same token such genetic
analyses are ideally suited to predict future risks of disease.

A variety of investigations suggest that many diseases in part are determined by the
genetic constitution of the individual. One group of genes in particular has been
associated with rare genetic predispositions to cancer. These are the genes involved
in maintaining the integrity of a person's DNA, the so-called DNA repair genes. One
set of such genes are the XP genes, which participate in nucleotide excision repair
and, when mutated, give rise to a 1000-fold increased risk of getting skin cancer. For
this reason we have previously investigated single nucleotide polymorphisms in one
DNA repair gene, XPD, for association with risk of skin cancer in a cohort of
Caucasian Americans, and found that one allele of the sequence polymorphism called
XPDe6 was associated with a moderately increased risk of getting basal cell
carcinoma, the most common form of skin cancer. Later, other groups have studied the
association between sequence polymorphisms in this and other DNA repair genes
and various forms of cancer. Some have reported positive results.

Very little is known about the function of the gene RAI. It was cloned because its
protein product binds to and inhibits RelA of the transcription regulator NF-kappaB.

Summary of the Invention 

The present invention relates in a first aspect to a group of nucleic acid sequences
found to be associated with cancer. The invention further relates to transcriptional
and translational products of said sequences. An allele in the r region can be
identified as correlated with an increased risk of developing cancer, the prognosis of
developed cancer, and responsiveness to cancer treatment on the basis of statistical
analyses of the incidence of a particular allele in individuals diagnosed with cancer.

Thus, in a first aspect the invention relates to a method for estimating the cancer risk
of an individual comprising

- providing a sample from said individual,

- assessing in the genetic material including human genes in said sample a
  sequence polymorphism

  - in a region corresponding to SEQ ID NO: 2, or a part thereof, or
  - in a region complementary to SEQ ID NO: 2, or a part thereof, or
  - in a transcription product from a sequence in a region corresponding to SEQ
    ID NO: 2, or a part thereof, or
  - in a translation product from a sequence in a region corresponding to SEQ ID
    NO: 2, or a part thereof,

- obtaining a sequence polymorphism response,

- estimating the cancer risk of said individual based on the sequence
  polymorphism response.

Preferably the invention relates to a method for estimating the cancer risk of an
individual comprising

- providing a sample from said individual,

- assessing in the genetic material including human genes in said sample a
  sequence polymorphism

  - in a region corresponding to SEQ ID NO: 1, or a part thereof, or
  - in a region complementary to SEQ ID NO: 1, or a part thereof, or
  - in a transcription product from a sequence in a region corresponding to SEQ
    ID NO: 1, or a part thereof, or
  - in a translation product from a sequence in a region corresponding to SEQ ID
    NO: 1, or a part thereof,

- obtaining a sequence polymorphism response,

- estimating the cancer risk of said individual based on the sequence
  polymorphism response.

The estimation of the cancer risk of an individual can involve the comparison of the
number and/or kind of polymorphic sequences identified with a predetermined
cancer risk profile. Such a profile can be based on statistical data obtained for a
relevant reference group of individuals.

The sequence of the r region is set forth as SEQ ID NO: 1, originating from the
cloning of human chromosome 19q published as part of the contig NT_011109 in the
database of human sequences established by the National Center for Biotechnology
Information and located on the internet at
http://www.ncbi.nlm.nih.gov/genome/guide/human/

The presence of an allele is determined by determining the nucleic acid sequence of
all or part of the region according to standard molecular biology protocols well
known in the art as described for example in Sambrook et al. (1989) and as set forth
in the Examples provided herein or products of the nucleic acid sequences.

In particular, the nucleic acid molecules of the present invention represent in a first
aspect nucleic acid sequences forming part of the region r corresponding to position
1522-37752 of SEQ ID NO: 1, and preferably to certain nucleic acid sequences
within the gene referred to herein as RAI. As demonstrated in the Examples
presented below, the RAI gene is in particular associated with human cancer diseases.

Furthermore, the invention relates to a method for estimating the cancer prognosis
of an individual comprising

- providing a sample from said individual,

- assessing in the genetic material including human genes in said sample a
  sequence polymorphism

  - in a region corresponding to SEQ ID NO: 1 or SEQ ID NO: 2, or a part
    thereof, or
  - in a region complementary to SEQ ID NO: 1 or SEQ ID NO: 2, or a part
    thereof, or
  - in a transcription product from a sequence in a region corresponding to SEQ
    ID NO: 1 or SEQ ID NO: 2, or a part thereof, or
  - in a translation product from a sequence in a region corresponding to SEQ ID
    NO: 1 or SEQ ID NO: 2, or a part thereof,

- obtaining a sequence polymorphism response,

- estimating the cancer prognosis of said individual based on the sequence
  polymorphism response.

The estimation of the cancer prognosis of an individual can involve the comparison
of the number and/or kind of polymorphic sequences identified with a predetermined
cancer prognosis profile. Such a profile can be based on statistical data obtained for
a relevant reference group of individuals.

Additionally provided is a method of identifying a human subject as having an
increased likelihood of responding to a treatment, comprising a) correlating the
presence of an r region allele genotype with an increased likelihood of responding to
treatment; and b) determining the r region allele genotype of the subject, whereby a
subject having an r region allele genotype correlated with an increased likelihood of
responding to treatment is identified as having an increased likelihood of responding
to treatment.

Thus, the present invention also relates to a method for estimating a treatment
response of an individual suffering from cancer to a cancer treatment comprising

- providing a sample from said individual,

- assessing in the genetic material including human genes in said sample a
  sequence polymorphism

  - in a region corresponding to SEQ ID NO: 1 or SEQ ID NO: 2, or a part
    thereof, or
  - in a region complementary to SEQ ID NO: 1 or SEQ ID NO: 2, or a part
    thereof, or
  - in a transcription product from a sequence in a region corresponding to SEQ
    ID NO: 1 or SEQ ID NO: 2, or a part thereof, or
  - in a translation product from a sequence in a region corresponding to SEQ ID
    NO: 1 or SEQ ID NO: 2, or a part thereof,

- obtaining a sequence polymorphism response,

- estimating the individual's response to the cancer treatment based on the
  sequence polymorphism response.

The estimation of the individual's response to cancer treatment can involve the
comparison of the number and/or kind of polymorphic sequences identified with a
predetermined cancer treatment response profile. Such a profile can be based on
statistical data obtained for a relevant reference group of individuals.
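
One simple way to compare the "number and/or kind" of polymorphic sequences with a
predetermined profile can be sketched as follows. This is an illustrative sketch
only: the marker names, the alleles marked as relevant, and the cut-off values are
hypothetical placeholders, and a real profile would have to be derived from
statistical data on a relevant reference group.

```python
def count_risk_alleles(genotypes, risk_alleles):
    """Count how many copies of the profile alleles an individual carries.

    genotypes    : dict mapping marker name to a two-letter genotype, e.g. "AG"
    risk_alleles : dict mapping marker name to the allele of interest
    """
    total = 0
    for marker, allele in risk_alleles.items():
        genotype = genotypes.get(marker)
        if genotype is None:
            continue  # marker was not typed in this individual
        total += genotype.upper().count(allele.upper())
    return total


def classify(count, profile):
    """Return the label of the highest cut-off that the count reaches.

    profile : list of (minimum_count, label) pairs in ascending order
    """
    label = profile[0][1]
    for minimum, name in profile:
        if count >= minimum:
            label = name
    return label


if __name__ == "__main__":
    # Hypothetical profile and genotypes, for illustration only.
    risk_alleles = {"marker_1": "G", "marker_2": "T"}
    profile = [(0, "reference"), (2, "elevated"), (4, "high")]
    genotypes = {"marker_1": "AG", "marker_2": "TT"}
    n = count_risk_alleles(genotypes, risk_alleles)
    print(n, classify(n, profile))
```

Counting alleles treats all markers as equally informative, which is a simplifying
assumption; markers in strong linkage disequilibrium are not independent and would
normally be combined as haplotypes instead.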

The invention also comprises primers or probes for use in the invention, as well as
kits including these. The primers and/or probes are preferably capable of hybridising
to SEQ ID NO: 1 or SEQ ID NO: 2, or a part thereof, in particular the r region, or a
part thereof, under stringent conditions.

Furthermore, the invention also relates to cloning vectors and expression vectors
containing the nucleic acid molecules of the invention, as well as hosts which have
been transformed with such nucleic acid molecules, including cells genetically
engineered to contain the nucleic acid molecules of the invention, and/or cells
genetically engineered to express the nucleic acid molecules of the invention. The nucleic
acids are preferably isolated from the r region and preferably contain one or more
sequence polymorphisms as described herein below in more detail. In addition to
host cells and cell lines, hosts also include transgenic non-human animals (or
progeny thereof).

In particular, the present invention is based on the discovery of the correlation between
single nucleotide polymorphisms (SNPs) and/or tandem repeats in the r region and
cancer. Thus, SNPs have been found in the r region as shown in table 1. However,
the present invention is not limited to the SNPs shown in table 1, but does include
any SNP in the region. Tandem repeats have been found in the r region as shown in
table 2. However, the present invention is not limited to the tandem repeats shown
in table 2, but does include any tandem repeat in the region.

The term human includes both a human having or suspected of having a cancer
disease and an asymptomatic human who may be tested for predisposition or
susceptibility to cancer. At each position the human may be homozygous for an allele or
the human may be a heterozygote.

Drawings

Fig. 1 shows a subregion of chromosome 19q

Fig. 2 shows odds ratios and p-values for individual sequence variations in relation
to risk of basal cell carcinoma

Fig. 3 shows odds p-values for association of different sequence variations with risk
of basal cell carcinoma among psoriatic Danes

Fig. 4 shows regions S1, S2 and S3 of SEQ ID NO: 2.

Detailed description of the invention 

The present invention relates to a characterization of a person's present and/or
future risk of getting certain forms of cancer. The characterization is based on the
analysis of sequence polymorphisms in a region of chromosome 19q in the person.

A number of polymorphisms in the chromosomal region 19q13.2-3 have been
identified and characterised. Surprisingly, the sequence polymorphisms with strongest
association to disease appeared to be located outside XPD. More specifically, the
sequences were located in a sub-region between XPD and ERCC1, and seemed to
have a maximum in or around the gene RAI (see Example 1). For persons getting
their skin cancer relatively early (before 50 years of age), it was found that
predictions got better (Example 2), and when two sequence polymorphisms in RAI were
combined, the prediction of early skin cancer got even better (Example 3). It was
also possible to combine sequence polymorphisms in RAI with sequence
polymorphisms outside the region and get highly positive results (Example 4).
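
The strength of association between an allele and disease in a case-control
comparison is commonly expressed as an odds ratio. The following minimal sketch
computes an odds ratio with an approximate 95% confidence interval by Woolf's
(log) method; the function name and the counts are hypothetical and are not
results from the Examples.

```python
import math


def odds_ratio(case_carriers, case_noncarriers, control_carriers, control_noncarriers):
    """Return (odds ratio, lower 95% bound, upper 95% bound) for a 2x2 table."""
    a, b, c, d = case_carriers, case_noncarriers, control_carriers, control_noncarriers
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    estimate = (a * d) / (b * c)
    # Standard error of the log odds ratio (Woolf's method).
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - 1.96 * se)
    upper = math.exp(math.log(estimate) + 1.96 * se)
    return estimate, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(odds_ratio(40, 60, 20, 80))
```

An interval that excludes 1 suggests an association at roughly the 5% level; small
cell counts call for an exact test instead of this approximation.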



The region of chromosome 19q, more precisely the region located in 19q13.2-3, with
which the present invention is concerned, is depicted in Figure 1 as it is presently
known together with the presently known or suspected genes. The arrows indicate
the directions of transcription of the genes. The absolute chromosome positions
shown are from the particular build of NCBI's map of chromosome 19, and will
probably change with time.

The region s stretches from the XPD gene to approximately the end of ERCC1 and
includes the region r and is defined by SEQ ID NO: 2. In the present context the
region s means SEQ ID NO: 2 and complementary sequence as well as
transcriptional products and translational products thereof.

One preferred section of the region s is S1 as shown in Fig. 4, more preferred S2 as
shown in Fig. 4, most preferred S3 as shown in Fig. 4.

The region r stretches from the beginning of, but not including, the XPD gene, to
approximately the end of ERCC1 and includes the genes RAI, LOC132978, and
ASE-1. More specifically r is bounded by and includes the following two sequences:
AGAACCCCCG CCCCTCCACC TCGTCTCAAA and TCCCTCCCCA
GAGACTGCAC CAGCGCAGCC, and is defined by SEQ ID NO: 1.

In the present context the region r means SEQ ID NO: 1 and complementary se- 
quence as well as transcriptional products and translational products thereof. 

One preferred section of the region r stretches approximately from the end of RAI to
the end of ASE-1 and includes the genes RAI, LOC162978, and ASE-1. More
specifically, this section of r is bounded by and includes the following sequences:
GAAGTGAGCC AAGATCACGC CACTGCACTC and GTGCCCACCT
GGGCCACCAG AAGGTGACAC. In the present context the region r means SEQ ID NO: 1
bases 1522-37752 and complementary sequence as well as transcriptional products
and translational products thereof.

Finally, in the claims the gene RAI is defined as including transcribed sequences of
the gene plus a 1500 base upstream promoter region. More specifically RAI is
bounded by and includes the following sequences: CATAACCACA ATGATGAGCA
TGTATTGAGT and ATGTTGTCCA GGCTGGTCTT GAACTCCTGA. In the present
context this section of the region r relates to SEQ ID NO: 1 bases 7761-22885 and
complementary sequence as well as transcriptional products and translational
products thereof.
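
The two ways of defining a section used above, by base numbers and by bounding
sequences, can be sketched as follows. This is an illustrative sketch only: the
function names are hypothetical, base numbers are assumed to be 1-based and
inclusive, and the toy sequence stands in for SEQ ID NO: 1, which is not reproduced
here.

```python
def clean(sequence):
    """Remove the spaces used to print sequences in blocks of ten and upper-case."""
    return sequence.replace(" ", "").upper()


def slice_by_position(sequence, start, end):
    """Return bases start..end, counted from 1 and including both ends."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("positions outside the sequence")
    return sequence[start - 1:end]


def slice_by_bounds(sequence, left, right):
    """Return the section bounded by and including the two bounding sequences.

    Returns None when either bounding sequence cannot be found.
    """
    sequence, left, right = clean(sequence), clean(left), clean(right)
    i = sequence.find(left)
    if i < 0:
        return None
    j = sequence.find(right, i + len(left))
    if j < 0:
        return None
    return sequence[i:j + len(right)]


if __name__ == "__main__":
    toy = "TTTACGTGGGCCCAAATTT"  # hypothetical stand-in sequence
    print(slice_by_bounds(toy, "ACGT", "AAA"))
    print(slice_by_position(toy, 4, 16))
```

Locating a section by its bounding sequences stays valid when the numbering of a
later genome build shifts, whereas base numbers refer to one fixed sequence listing.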



Modifications to the human genome map are known to occur from time to time. It is
therefore possible that the defining sequences quoted above will change slightly in
future maps.

Fragments or parts of the region s or r as used herein relate to any fragment of at
least 100 nucleic acid residues in length, or multiples of 100 nucleic acid residues in
length, starting from SEQ ID NO: 1 position 1, 100, 200, 300, 400, 500, 600, 700,
800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000,
2100, 2200, 2300, 2400, 2500, 2600, 2700, 2800, 2900, 3000, and so forth,
each fragment starting position having an increment of 100 nucleic acid residues.
Multiples are preferably multiples of e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14,
15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49 and 50.

For fragments starting at position 1, the length of said fragments will thus be e.g.
100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500,
1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2700,
2800, 2900, 3000, and so forth, using suitable multipliers as listed herein above.

For fragments starting at position 100, the length of said fragments will thus be e.g.
100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500,
1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2700,
2800, 2900, 3000, and so forth, using suitable multipliers as listed herein above.

For fragments starting at position 7700, the length of said fragments will thus be e.g.
100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500,
1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2700,
2800, 2900, 3000, 3500, 4000, 4500, 5000, 5500, 6000, 6500, 7000, 7500, 8000,
8500, 9000, 9500, 10000, 10500, 11000, 11500, 12000, 12500, 13000, 13500,
14000, 14500, 15000, and so forth, using suitable multipliers such as e.g. the
ones listed hereinabove.
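
The fragment definition above can be sketched as a short enumeration. This is an
illustrative sketch only: the function names are hypothetical, positions are
assumed to be 1-based, and the cap of 50 on the multiple is taken from the
preferred multiples listed above, although the text also allows larger ones.

```python
def start_positions(region_length, step=100):
    """Return the fragment start positions 1, 100, 200, ... within the region."""
    return [1] + list(range(step, region_length + 1, step))


def fragment_lengths(start, region_length, step=100, max_multiple=50):
    """Return the lengths (multiples of step) that fit when starting at start."""
    lengths = []
    for multiple in range(1, max_multiple + 1):
        length = multiple * step
        # The last base of the fragment must still lie inside the region.
        if start + length - 1 <= region_length:
            lengths.append(length)
    return lengths


if __name__ == "__main__":
    # A toy region of 450 bases, for illustration only.
    for start in start_positions(450):
        print(start, fragment_lengths(start, 450))
```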

The nucleic acid sequences according to the present invention make it possible to
estimate cancer risk in an individual by using sequence polymorphisms originating
from a specific region of chromosome 19.

Estimation of cancer risks has a number of important applications:

(1) Individuals with reasons to suspect that they are at risk for getting cancer would
be able to clarify their situation and, if possible, take protective action. Alternatively,
anti-cancer campaigns, companies, hospitals or other institutions could offer a
service to help people clarify their situation. It would for instance be possible to test
persons when they got their first basal cell carcinoma, which is often recurrent and also
is a moderate predictor for other cancers. If the persons were in a high-risk group,
one could then advise them about, or they could of their own accord choose,
risk-reducing behaviour, such as avoidance of excessive sun exposure, abstaining from
smoking etc. About 5 percent of the Danish population will at some point in their life
get a basal cell carcinoma.

(2) Anti-cancer campaigns, companies, hospitals or other institutions would be able
to define relevant target subpopulations and focus information on risk-reducing
behaviour on these persons. They might perhaps also be in a position to inform the
remainder of the population that they need not worry. Lung cancer affects
approximately 10-15 percent of smokers and thus approximately 5 percent of the
population, somewhat varying from country to country. Malignant melanoma, a
sun-induced, often lethal form of skin cancer, affects approximately 700 persons a year
in Denmark or about 1 percent of the Danish population.

(3) The drugs used in cancer treatment are often carcinogenic themselves and
individual responses to them vary considerably, both with respect to tolerance to the
treatment and with respect to efficacy of the treatment. It is an obvious possibility
that the region of chromosome 19 here dealt with, which contains DNA repair genes
known to modulate carcinogen responses, also modulates response to anti-cancer
agents. Hence, analysis of the region may facilitate better choices of treatment for
cancer, and/or help predict the future course of disease.

By sequence polymorphism is understood any single nucleotide, tandem repeat,
insert, deletion or block polymorphism, which varies among humans, whether it is of
biological importance or not.

Position of sequence polymorphism in the region s and r

In one embodiment of the methods of the invention, preferably the method for
diagnosis as described herein, one or more single nucleotide polymorphism(s) at a
predetermined position in the region r (SEQ ID NO: 1) are identified and used for e.g.
cancer risk profiling and/or cancer treatment response profiling. Presently preferred
single nucleotide polymorphism(s) are listed in Tables 1a and 1b. However, the
present invention relates to any SNP in the r region.



Table 1a

Identification in dbSNP 1    Alleles    Position in SEQ ID NO: 1
rs#3138378                   A/G        137
rs#3138376                   G/T        235
rs#209725                    C/A        ambiguous location
rs#2377328                   C/T        7199
rs#8986                      A/T        7887 (=RAIe6)
rs#2017154                   A/C        12115
rs#2017104                   A/G        12190
rs#2070830                   T/G        14575
rs#1970764                   A/G        15798 (=RAIi1)
rs#2226949                   T/G        32035
rs#959457                    C/T        32446
rs#233S218                   C/A        32447
rs#766934                    A/G        32481
rs#928911                    C/T        32785
rs#1005165                   C/T        33974
rs#1005166                   C/T        34119
rs#967591                    A/G        34858 (=ASE-1e1)
rs#1046282                   T/C        35596
rs#2013521                   A/T        36254
rs#735482                    A/C
rs#762562                    A/G        37267
rs#2336919                              ambiguous location
rs#743571                    C/G        37786

Table 1b

Identification   Sequence context (variation in parentheses)                         Position in
in dbSNP 1                                                                           SEQ ID NO: 1
ro#3Q4756q       ataaaaaaat aaaaaaaa (-/AA) atagccgagc afcggfcggtgg                  4795-6
rs?5000l50       tgttgtccaa gctggCAGAG (A/G) fctctttgtttg tttgtttgag                 6908
re*45896fiS      CCAGGGCATA CAACCAGCAC (T/A) TGATTTTctg tgtgaccfcca                  20613
ra#4803B14       cctgcctgct. tgctttctet (C/T) cccctcctcc tt^cttccxn                  25650
rs#4Q0361S       ctcgcctgct CCCtCLCtCL (C/T) tCeCCCtttC CCtCttCCCt                   2U654
ro#4S725l4       CTGTTCA6GC TGGOGGCTCA (C/T) TTGGATOAAC AQSGGAGTGTO                  28691
ra#48022S2       agccaccaca CCtggccAAA (C/T) CACCTATTCT GAAACGCCCC                   29886
ra*4903S16       GAjGCCTATTG TTGGAAAGTT (C/T) TGAGTCCAAG ATTCTATCTT                  29B15
re*4802233       CCTAACCCAG GOTTGCACTG (C/T) TCTGGAAGTC TAGATGGATG                   29922
rnft4353560      GTAAGTGACt cttttttttt (C/T) tttcggtaga gactcagccn                   30439
rots 2129 8 9    TCGCOOACAG GACTG (C/T) CTCTTCTAGA GGCTCAGTCT                        36994
ro#3212988       TGGCTGAGAC TCAAC (C/T) GTCACCCCCT CCTCTGGCTC                        37068
rp#3213997       GTGTGACCTC TCTCT (-/TTC) TTCTTCTTCT TCTTCTTGGT                      37431-37433
*p#32129B6       GCTGCTGCTG CTGCT (T/G) CTTCCGCTTC TTGTCCCGGC                        37660


1 dbSNP is the database of SNPs established by the National Center for
Biotechnology Information and located on the Internet at http://www.ncbi.nlm.nih.gov/SNP/



In another embodiment of the invention, preferably the method described herein is
one in which the tandem repeat is at a position as described in Table 2:

Table 2

Identification in UniSTS 2

D19S908
STS-W67936
D19S543
D19S393
STS-R48186
GDB:181915
RH47033
GDB:190019

2 UniSTS is a database of unique sequence tag sites established by the National Center
for Biotechnology Information and located on the internet at
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=unists

In another embodiment of the invention, the method for diagnosis described herein
is preferably one in which the sequence polymorphism is in region r. Testing for the
presence of the RAI gene allele is especially preferred because, without wishing to
be bound by theoretical considerations, of its association with increased risk of
cancer (as explained herein).

In one embodiment of the methods of the invention, preferably the method for
diagnosis as described herein, one or more single nucleotide polymorphism(s) at a
predetermined position in the region s (SEQ ID NO: 2) are identified and used for e.g.
cancer risk profiling and/or cancer treatment response profiling. A presently preferred
polymorphism is the four base pair deletion shown in Fig. 4 corresponding to
TGTC. However, the present invention relates to any polymorphism and SNP in the
s region.



The sequence polymorphism of the invention comprises at least one base
difference, such as at least two base differences. As described above the sequence
polymorphism comprises at least one single nucleotide polymorphism, such as at least
two single nucleotide polymorphisms. Also, the sequence polymorphism comprises
at least one tandem repeat polymorphism, such as at least two tandem repeat
polymorphisms.

Also, the sequence polymorphism may be a combination of single nucleotide
polymorphisms and tandem repeats.



The status of the individual may be determined by reference to allelic variation at
one, two, three, four or more of the above loci.

Cell sample

The cell sample used in the present invention may be any suitable cell sample
capable of providing the genetic material for use in the method. In a preferred
embodiment, the cell sample is a blood sample, a tissue sample, a sample of secretion,
semen, ovum, a washing of a body surface (e.g. a buccal swab), or a clipping of a
body surface (hairs or nails), such as wherein the cell is selected from white blood
cells and tumour tissue.



It will be appreciated that the test sample may equally be a nucleic acid sequence
corresponding to the sequence in the test sample, that is to say that all or a part of
the region in the sample nucleic acid may firstly be amplified using any convenient
technique, e.g. PCR, before use in the analysis of variation in the region.

Detection methods 

Detection may be conducted on the sequence of SEQ ID NO: 1, SEQ ID NO: 2 or
a complementary sequence as well as on transcriptional products (mRNA) and
translational products (polypeptides, proteins) therefrom.



It will be apparent to the person skilled in the art that there are a large number of
analytical procedures which may be used to detect the presence or absence of
variant nucleotides at one or more of the positions mentioned herein in the r region.
Mutations or polymorphisms within or flanking the r region can be detected by utilizing a
number of techniques. Nucleic acid from any nucleated cell can be used as the
starting point for such assay techniques, and may be isolated according to standard
nucleic acid preparation procedures that are well known to those of skill in the art. In
general, the detection of allelic variation requires a mutation discrimination
technique, optionally an amplification reaction and a signal generation system. Table 3
lists a number of mutation detection techniques, some based on the PCR. These
may be used in combination with a number of signal generation systems, a selection
of which is listed in Table 4. Further amplification techniques are listed in Table 5.
Many current methods for the detection of allelic variation are reviewed by Nollau et
al., Clin. Chem. 43, 1114-1120, 1997; and in standard textbooks, for example
"Laboratory Protocols for Mutation Detection", Ed. by U. Landegren, Oxford University
Press, 1996 and "PCR", 2nd Edition by Newton & Graham, BIOS Scientific
Publishers Limited, 1997.





Table 3

Abbreviations:

ALEX(TM)   Amplification refractory mutation system linear extension
ARMS(TM)   Amplification refractory mutation system
b-DNA      Branched DNA
CMC        Chemical mismatch cleavage
bp         base pair
COPS       Competitive oligonucleotide priming system
DGGE       Denaturing gradient gel electrophoresis
FRET       Fluorescence resonance energy transfer
LCR        Ligase chain reaction
MASDA      Multiple allele specific diagnostic assay
NASBA      Nucleic acid sequence based amplification
OLA        Oligonucleotide ligation assay
PCR        Polymerase chain reaction
PTT        Protein truncation test
RFLP       Restriction fragment length polymorphism
SDA        Strand displacement amplification
SNP        Single nucleotide polymorphism
SSCP       Single-strand conformation polymorphism analysis
SSR        Self sustained replication
TGGE       Temperature gradient gel electrophoresis

Table 4 illustrates various mutation detection techniques capable of being used for
SNP detection.

Table 4

General techniques: DNA sequencing, Sequencing by hybridisation, SNAPshot.

Scanning techniques: PTT*, SSCP, DGGE, TGGE, Cleavase, Heteroduplex analysis,
CMC, Enzymatic mismatch cleavage

Hybridisation based techniques

Solid phase hybridisation: Dot blots, MASDA, Reverse dot blots, Oligonucleotide
arrays (DNA Chips)

Solution phase hybridisation: Taqman.TM. - U.S. Pat. Nos. 5,210,015 & 5,487,972
(Hoffmann-La Roche), Molecular Beacons - Tyagi et al. (1996), Nature Biotechnology,
14, 303; WO 95/13399 (Public Health Inst., New York), LightCycler, optionally in
combination with FRET.

Extension based: ARMS.TM., ALEX.TM. - European Patent No. EP 332435 B1
(Zeneca Limited), COPS - Gibbs et al. (1989), Nucleic Acids Research, 17, 2347.

Incorporation based: Mini-sequencing, APEX

Restriction enzyme based: RFLP, Restriction site generating PCR

Ligation based: OLA

Other: Invader assay

Various signal generation or detection systems are listed below:

Fluorescence: FRET, Fluorescence quenching, Fluorescence polarisation - United
Kingdom Patent No. 2226996 (Zeneca Limited)

Other: Chemiluminescence, Electrochemiluminescence, Raman, Radioactivity, Colorimetric,
Hybridisation protection assay, Mass spectrometry



Table 5 illustrates examples of further amplification techniques.

Table 5

SSR, NASBA, LCR, SDA, b-DNA

Preferred mutation detection techniques include ARMS.TM., ALEX.TM., COPS,
Taqman, Molecular Beacons, RFLP, and restriction site based PCR and FRET
techniques.

Particularly preferred methods include FRET, Taqman, ARMS.TM. and RFLP based
methods.

In a preferred embodiment, mutations or polymorphisms can be detected by using a
microassay of nucleic acid sequences immobilized to a substrate or "gene chip"
(see, e.g., Cronin et al., 1996, Human Mutation 7:244-255).

Further, improved methods for analyzing DNA polymorphisms, which can be utilized
for the identification of region r specific mutations, have been described that capitalize
on the presence of variable numbers of short, tandemly repeated DNA sequences
between the restriction enzyme sites. For example, Weber (U.S. Pat. No.
5,075,217) describes a DNA marker based on length polymorphisms in blocks of
(dC-dA)n-(dG-dT)n short tandem repeats. The average separation of (dC-dA)n-(dG-dT)n
blocks is estimated to be 30,000-60,000 bp. Markers that are so closely
spaced exhibit a high frequency of co-inheritance, and are extremely useful in the
identification of genetic mutations, such as, for example, mutations within the RAI
gene, and the diagnosis of diseases and disorders related to RAI mutations.

Also, Caskey et al. (U.S. Pat. No. 5,364,759) describe a DNA profiling assay for
detecting short tri- and tetranucleotide repeat sequences. The process includes extracting
the DNA of interest, such as the RAI gene, amplifying the extracted DNA,
and labelling the repeat sequences to form a genotypic map of the individual's DNA.

The level of RAI gene expression can also be assayed. For example, RNA from a
cell type or tissue known, or suspected, to express the RAI gene, such as brain,
may be isolated and tested utilizing hybridization or PCR techniques such as are
described above. The isolated cells can be derived from cell culture or from a patient.
The analysis of cells taken from culture may be a necessary step in the assessment
of cells to be used as part of a cell-based gene therapy technique or, alternatively,
to test the effect of compounds on the expression of the RAI gene. Such
analyses may reveal both quantitative and qualitative aspects of the expression
pattern of the RAI gene, including activation or inactivation of RAI gene expression.

In one embodiment of such a detection scheme, a cDNA molecule is synthesized
from an RNA molecule of interest (e.g., by reverse transcription of the RNA molecule
into cDNA). A sequence within the cDNA is then used as the template for a
nucleic acid amplification reaction, such as a PCR amplification reaction, or the like.
The nucleic acid reagents used as synthesis initiation reagents (e.g., primers) in the
reverse transcription and nucleic acid amplification steps of this method are chosen
from among the RAI gene nucleic acid reagents described above. The preferred
lengths of such nucleic acid reagents are at least 8-30 nucleotides. For detection of
the amplified product, the nucleic acid amplification may be performed using radioactively
or non-radioactively labeled nucleotides. Alternatively, enough amplified
product may be made such that the product may be visualized by standard ethidium
bromide staining or by utilizing any other suitable nucleic acid staining method.

Additionally, it is possible to perform such RAI gene expression assays "in situ", i.e.,
directly upon tissue sections (fixed and/or frozen) of patient tissue obtained from
biopsies or resections, such that no nucleic acid purification is necessary. Nucleic
acid reagents such as those described above may be used as probes and/or primers
for such in situ procedures (see, for example, Nuovo, G. J., 1992, "PCR In Situ
Hybridization: Protocols And Applications", Raven Press, NY).

Alternatively, if a sufficient quantity of the appropriate cells can be obtained, standard
Northern analysis can be performed to determine the level of mRNA expression
of the RAI gene.

Activity of the gene

Another method for detecting sequence polymorphism is by analysing the activity of
gene products resulting from the sequences. Accordingly, in one embodiment the
detection uses the activity of the RAI gene product as compared to a reference in
the method. In particular, if the activity of the gene is decreased or increased by at
least or about 50%, such as at least or about 40%, for example at least or about
30%, such as at least or about 20%, for example at least or about 10%, such as at
least or about 5%, for example at least or about 2%,
it indicates a sequence polymorphism in the gene.






Mutations outside the region 



The present invention may combine the result of sequence polymorphism within the
region r or s with sequence polymorphism outside the region in order to increase the
probability of the correlation.

Primers 

The primer nucleotide sequences of the invention further include: (a) any nucleotide
sequence that hybridizes to a nucleic acid molecule of the region r or its complementary
sequence or RNA products under stringent conditions, e.g., hybridization to
filter-bound DNA in 6x sodium chloride/sodium citrate (SSC) at about 45°C followed
by one or more washes in 0.2x SSC/0.1% SDS at about 50-65°C, or (b) under highly
stringent conditions, e.g., hybridization to filter-bound nucleic acid in 6x SSC at
about 45°C followed by one or more washes in 0.1x SSC/0.2% SDS at about 68°C,
or under other hybridization conditions which are apparent to those of skill in the art
(see, for example, Ausubel FM et al., eds., 1989, Current Protocols in Molecular
Biology, Vol. I, Greene Publishing Associates, Inc., and John Wiley & Sons, Inc., New
York, at pp. 6.3.1-6.3.6 and 2.10.3). Preferably the nucleic acid molecule that hybridizes
to the nucleotide sequence of (a) and (b), above, is one that comprises the
complement of a nucleic acid molecule of the region r or a complementary sequence
or RNA product thereof. In a preferred embodiment, nucleic acid molecules comprising
the nucleotide sequences of (a) and (b) comprise a nucleic acid molecule of
RAI or a complementary sequence or RNA product thereof.

Among the nucleic acid molecules of the invention are deoxyoligonucleotides ("oligos")
which hybridize under highly stringent or stringent conditions to the nucleic
acid molecules described above. In general, for probes between 14 and 70 nucleotides
in length the melting temperature (Tm) is calculated using the formula:

Tm(°C)=81.5+16.6(log[monovalent cations (molar)])+0.41(% G+C)-(500/N)

where N is the length of the probe. If the hybridization is carried out in a solution
containing formamide, the melting temperature is calculated using the equation

Tm(°C)=81.5+16.6(log[monovalent cations (molar)])+0.41(% G+C)-(0.61% formamide)-(500/N)

where N is the length of the probe. In general, hybridization is carried
out at about 20-25 degrees below Tm (for DNA-DNA hybrids) or 10-15 degrees below
Tm (for RNA-DNA hybrids).
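
As a worked illustration of the two Tm expressions above, the following Python sketch
evaluates them for a given probe. The function names (gc_percent, melting_temperature,
hybridization_window) and the example concentrations are illustrative assumptions and
not part of the disclosure; the coefficient 16.6 is the value used in the commonly
published form of this empirical formula.

```python
import math


def gc_percent(sequence):
    """Return the G+C content of a nucleotide sequence as a percentage."""
    seq = "".join(sequence.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def melting_temperature(length, gc, monovalent_molar, formamide_percent=0.0):
    """Estimate Tm (degrees C) for a probe of 14-70 nucleotides.

    length            -- probe length N in nucleotides
    gc                -- G+C content in percent
    monovalent_molar  -- monovalent cation concentration in mol/l
    formamide_percent -- formamide content in percent (0 if none)
    """
    if not 14 <= length <= 70:
        raise ValueError("the formula is stated for probes of 14-70 nucleotides")
    if monovalent_molar <= 0:
        raise ValueError("cation concentration must be positive")
    return (81.5
            + 16.6 * math.log10(monovalent_molar)
            + 0.41 * gc
            - 0.61 * formamide_percent
            - 500.0 / length)


def hybridization_window(tm, hybrid="DNA-DNA"):
    """Return (lowest, highest) hybridization temperature relative to Tm."""
    offsets = {"DNA-DNA": (20.0, 25.0), "RNA-DNA": (10.0, 15.0)}
    near, far = offsets[hybrid]
    return (tm - far, tm - near)


if __name__ == "__main__":
    # Hypothetical example: a 20-base probe with 50% G+C in 0.1 M salt.
    tm = melting_temperature(20, 50.0, 0.1)
    print(round(tm, 1), hybridization_window(tm))
```

For a 20-base probe with 50% G+C in 0.1 M monovalent cations this gives a Tm of about
60.4°C, and about 29.9°C when 50% formamide is present.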

Exemplary highly stringent conditions may refer, e.g., to washing in 6x SSC/0.05%
sodium pyrophosphate at 37°C (for about 14-base oligos), 48°C (for about 17-base
oligos), 55°C (for about 20-base oligos), and 60°C (for about 23-base oligos).

Accordingly, the invention further provides nucleotide primers or probes which detect
the r region polymorphisms of the invention. The assessment may be conducted
by means of at least one nucleic acid primer or probe, such as a primer or probe of
DNA, RNA or a nucleic acid analogue such as peptide nucleic acid (PNA) or locked
nucleic acid (LNA). The nucleotide primer or probe is preferably capable of hybridising
to a subsequence of the region corresponding to SEQ ID NO: 1, or a part
thereof, or a region complementary to SEQ ID NO: 1.

According to one aspect of the present invention there is provided an allele-specific
oligonucleotide probe capable of detecting an r region polymorphism at one or more
positions in the r region as defined by the positions in SEQ ID NO: 1.

The allele-specific oligonucleotide probe is preferably 5-50 nucleotides, more preferably
about 5-35 nucleotides, more preferably about 5-30 nucleotides, more preferably
at least 9 nucleotides.

The design of such probes will be apparent to the molecular biologist of ordinary
skill. Such probes are of any convenient length such as up to 50 bases, up to 40
bases, more conveniently up to 30 bases in length, such as for example 8-25 or 8-15
bases in length. In general such probes will comprise base sequences entirely
complementary to the corresponding wild type or variant locus in the region. However,
if required one or more mismatches may be introduced, provided that the discriminatory
power of the oligonucleotide probe is not unduly affected. The probes of
the invention may carry one or more labels to facilitate detection.

In one embodiment, the primers and/or probes are capable of hybridizing to a subsequence
selected from the group of subsequences below, wherein the polymorphism
is denoted as for example T/C:

1. GCTCTGAAAC TTACTAGCCC (A/G) GTATTTATGG AGAGGCATTT
2. GTGGTCAAAT TCTCATTCAT CGTGG (T/C) CCAGGCAAGC ACACTTCCTC
3. ACCCTGAGGT GAGCACCTGT TCCTT (C/T) TCCTTGCCCT TAGCCCAGAG GTAGA
4. GGGCAGGGGT TTGTGCCTCC AATGA (G/A) CACAAGCTCC CCCTGCCCCC CAACT
5. CCTGGCGGTG GCCGTCACCA GCTTT (T/C) GGGGGTGTTT GGGAAGCTGG
6. CTCCAGCCCC ACTGTTCCCT (A/G) GGCCCTATTG GTCCCCCTGG
7. ACAAGGAGGA GGCAGAAGTG AGGTT (G/C) AAACCCACTG CCCAATCTTA
8. CCAACACGGT GAAACCCCGT CTGTA (T/C) TAAAAATACA AAAATTAGCC
9. AATCCAGGAC CCCATAATCT TGCGT (C/T) ATCTAAAACA ATAATGGTGA
10. CCCAAGGGGG CGAGGGGAGG GTGAA (A/G) GGGTGGGACG GGGGCAGCCG
11. GAAGTGAGAA GGGGGCTGGG GGTCG (G/A) CGCTCGCTAG CGGGCGCGGG
12. CGCACGCGCA GTATCCCGAT TGGCT (C/G) TGCCCTAGCG GATTGACGGG
13. AACTCCTGGG TTCGATCAAT ACTCA (GACA/-) ATCTTGGCAG GCGCAGGAGG
14. GCTGGGATTA CAGGCTTGAG CCACC (A/G) CGCCCGGCCT GCAAAGCCAT
15. TTTTGTATCT TTAGTAGAGA CAGG (T/G) TTTCTCCATG TTGGTCAGGC
16. GCCTGAGCCT CCCGAGTAGC TGAGACT (C/A) CAGGTGCCCG CCACCACGCC
17. TGAAATTGTA GGTTGAGAGG CCAGGCG (C/T) GGTGCTCACG CCTGTAATTT
18. GTTTATAAAC ATTAAACCAG (T/A) GCTGTGTGAA GGCACTTAAT
19. CCGTCTCTAT TAAAAATATA AAA (A/C) AATTTAGCCG GGTGTAGCGG
20. GGGAGGCTCG AGGCGGGC (A/G) GATTGCATGA GCTCAGGATT
21. TCCCAAGTTT CAGGGCCCAA (T/G) ATTCTCAAAT CACAGGATTC
22. TGCAGTGAGC TGAGATCGC (A/G) CCACTGCACT CCAGCCTGGG
23. TCTTAGGACG CATGGGGGT (T/G) GAGAGAACGG GGAGATAGAC
24. CTGGGTTCTA GAACTACC (C/T) ATGCAAACCC AGCTGTTTCC
25. ATTCTGCCCT GGGTTCTAGA ACTACCT (C/A) TGCAAACCCA GCTGTTTCCC
26. GCTGTTTCCC ACCCCATAAG GCA (A/G) TAGGGGAGCC CACCTCCGCC
27. GACCTAGAAG ATCGGTCGAG A (C/T) AGCAGCTTGA GGCTGGCAGG
28. CTGGCCAGGA ATGCAGTCGG GTCAC (C/T) CTGTCTAGCC ACCGTCTCGC
29. GGGAGGAGTC GCCGATCAGG (C/T) CCCTTCCTGA AAGTCATCGA
30. GCAGCCCGGG CTACAGGGTT (A/G) CCTGAGGTGT GGGTCCCAGG
31. TAGAAATACT AACAAAGGGC (T/C) GTGGGTTTCT CCCCCTGCTT
32. ACAGGAGAGG GAAGGTTTTT TG (A/T) TTTTTTTTTT GTTTTTTTTT
33. GAAGAGGAAG AAGCCCAAAG GGA (A/C) AGAAACCTTC GAGCCAGAAG
34. GCGCCTCAAC AGCCAGAAGG AGCG (A/G) AGCCTCAGGC CCAGGCAGCT
35. TTGAGACTCT CTGTTTGAT (A/G) CTTCACTCAG AAGGTGCTTC
36. AGGCCAGGCT CCTGCTGGCT G (C/G) GCTGGTGCAG TCTCTGGGGA
37. CGCCTATACC CTCAAGCAT (C/T) TATCCATTGA GTTACAAACA
38. ACCATCCCCC GCCTTCCGTT (A/C) GTCCGGCCCC CGAGGCTAGC
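
The bracketed notation used in the subsequences above, for example (A/G) for a
substitution or (GACA/-) for a deletion, can be expanded mechanically into the two
allelic target sequences against which a probe would be designed. The following
Python sketch shows one way to do this; the function name expand_alleles and the
parsing rules are illustrative assumptions, not part of the disclosure.

```python
import re

# Left flank, first allele, second allele, right flank. A "-" allele
# denotes absence of the bases (a deletion). Whitespace is ignored.
_SUBSEQUENCE = re.compile(
    r"^([ACGT\s]+)\(\s*([ACGT]+|-)\s*/\s*([ACGT]+|-)\s*\)([ACGT\s]+)$"
)


def expand_alleles(subsequence):
    """Expand 'FLANK (X/Y) FLANK' into a tuple of the two allelic sequences."""
    match = _SUBSEQUENCE.match(subsequence.strip().upper())
    if match is None:
        raise ValueError("not in 'FLANK (X/Y) FLANK' notation: %r" % subsequence)
    left, first, second, right = match.groups()
    left = "".join(left.split())
    right = "".join(right.split())
    return tuple(
        left + ("" if allele == "-" else allele) + right
        for allele in (first, second)
    )


if __name__ == "__main__":
    for allele_sequence in expand_alleles(
        "GCTCTGAAAC TTACTAGCCC (A/G) GTATTTATGG AGAGGCATTT"
    ):
        print(allele_sequence)
```

Each returned string is a contiguous sequence with one allele in place, so that the
length and composition of a candidate probe spanning the polymorphic site can be
checked directly.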

In another embodiment, the primers and/or probes are capable of hybridizing to a
subsequence selected from the group of subsequences below:

1. TGAAATTGTA GGTTGAGAGG CCAGGCG (C/T) GGTGCTCACG CCTGTAATTT
2. GTTTATAAAC ATTAAACCAG (T/A) GCTGTGTGAA GGCACTTAAT
3. CCGTCTCTAT TAAAAATATA AAA (A/C) AATTTAGCCG GGTGTAGCGG
4. GGGAGGCTCG AGGCGGGC (A/G) GATTGCATGA GCTCAGGATT
5. TCCCAAGTTT CAGGGCCCAA (T/G) ATTCTCAAAT CACAGGATTC
6. TGCAGTGAGC TGAGATCGC (A/G) CCACTGCACT CCAGCCTGGG
7. TCTTAGGACG CATGGGGGT (T/G) GAGAGAACGG GGAGATAGAC
8. CTGGGTTCTA GAACTACC (C/T) ATGCAAACCC AGCTGTTTCC
9. ATTCTGCCCT GGGTTCTAGA ACTACCT (C/A) TGCAAACCCA GCTGTTTCCC
10. GCTGTTTCCC ACCCCATAAG GCA (A/G) TAGGGGAGCC CACCTCCGCC
11. GACCTAGAAG ATCGGTCGAG A (C/T) AGCAGCTTGA GGCTGGCAGG
12. CTGGCCAGGA ATGCAGTCGG GTCAC (C/T) CTGTCTAGCC ACCGTCTCGC
13. GGGAGGAGTC GCCGATCAGG (C/T) CCCTTCCTGA AAGTCATCGA
14. GCAGCCCGGG CTACAGGGTT (A/G) CCTGAGGTGT GGGTCCCAGG
15. TAGAAATACT AACAAAGGGC (T/C) GTGGGTTTCT CCCCCTGCTT
16. ACAGGAGAGG GAAGGTTTTT TG (A/T) TTTTTTTTTT GTTTTTTTTT
17. GAAGAGGAAG AAGCCCAAAG GGA (A/C) AGAAACCTTC GAGCCAGAAG
18. GCGCCTCAAC AGCCAGAAGG AGCG (A/G) AGCCTCAGGC CCAGGCAGCT

In yet another embodiment, the primers and/or probes are capable of hybridizing to
a subsequence selected from the group of subsequences below:

1. GTTTATAAAC ATTAAACCAG (T/A) GCTGTGTGAA GGCACTTAAT
2. CCGTCTCTAT TAAAAATATA AAA (A/C) AATTTAGCCG GGTGTAGCGG
3. GGGAGGCTCG AGGCGGGC (A/G) GATTGCATGA GCTCAGGATT
4. TCCCAAGTTT CAGGGCCCAA (T/G) ATTCTCAAAT CACAGGATTC
5. TGCAGTGAGC TGAGATCGC (A/G) CCACTGCACT CCAGCCTGGG

It is preferred in one embodiment that at least one sequence polymorphism is assessed
in a region corresponding to SEQ ID NO: 1 position 1521-37752 (r), including
at least one sequence polymorphism assessed in a region corresponding to
SEQ ID NO: 1 position 7760-22885.

In another embodiment, the method of the invention relates to at least one sequence
polymorphism assessed in a region corresponding to SEQ ID NO: 1 position
34301-37683, ending with the coding region of ASE-1 (cagcctgtgtag), where tag
is the stop codon.

In another embodiment, the method of the invention relates to at least one sequence
polymorphism assessed in a region corresponding to the S1 as shown in Fig. 4.

In another embodiment, the method of the invention relates to at least one sequence
polymorphism assessed in a region corresponding to the S2 as shown in Fig. 4.

In another embodiment, the method of the invention relates to at least one sequence
polymorphism assessed in a region corresponding to the S3 as shown in Fig. 4.
More particularly, the method of the invention relates to at least one sequence polymorphism
being a deletion assessed in a region corresponding to the S3 as shown
in Fig. 4, more particularly a 4 base pair deletion in a region corresponding to the S3
as shown in Fig. 4, even more particularly a deletion of TGTC in S3 as shown in
Fig. 4.

In a preferred embodiment the primers or probes are selected from one or more of
the following:

TGGCTAACACGGTGAAACC (SEQ ID NO:7)
GGAATCCAAAGATTCTATGATGG (SEQ ID NO:8)
GGGAGGCGGAGCTTGCAGTGA (SEQ ID NO:9)
CTGAGATCGCACCACTGCAC (SEQ ID NO:10)
GGTTTTCTGCTCTGCACACG (SEQ ID NO:11)
CCTTTCTCCTTCCACCAACG (SEQ ID NO:12)
CGGGCTACAGGGTTACCTGAG (SEQ ID NO:13)
TCTGCAACCTGGTGCGAGCAGC (SEQ ID NO:14)
CCTACCACCATCATCACATCC (SEQ ID NO:15)
GCCTTGCCAAAAATCATAACC (SEQ ID NO:16)
CCTCTCCCCAATTAAGTGCCTTCACACAGC (SEQ ID NO:17)
AGCCAGGGAGGTTGAGGCT (SEQ ID NO:18)
AGACAGCCCTGAATCAGCAC (SEQ ID NO:19)
GCAATGAGCCGAGATAGAA (SEQ ID NO:20)
TGGCTAGCCCATTACTCTA (SEQ ID NO:21)

According to another aspect of the present invention there is provided a diagnostic
nucleic acid primer capable of detecting an r region polymorphism at one or more
positions in the r region as defined by SEQ ID NO: 1 or the s region as defined
by SEQ ID NO: 2.

The primer or probe may be a diagnostic nucleic acid primer defined as an allele
specific primer, used, generally together with a constant primer, in an amplification
reaction such as a PCR reaction, which provides the discrimination between alleles
through selective amplification of one allele at a particular sequence position. The
diagnostic primer is preferably 5-50 nucleotides, more preferably about 5-35 nucleotides,
more preferably about 5-30 nucleotides, more preferably at least 9 nucleotides.
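
One common way to obtain allele discrimination by selective amplification, as in
ARMS-type assays, is to place the polymorphic base at the 3' terminus of the primer.
The Python sketch below assumes that design choice, which the text above does not
prescribe, and handles single-base substitutions only. The function name
allele_specific_primers and the default primer length of 20 are hypothetical.

```python
def allele_specific_primers(left_flank, alleles, length=20):
    """Build one forward primer per allele, ending on the polymorphic base.

    left_flank -- sequence immediately 5' of the polymorphic position
    alleles    -- iterable of single-base alleles, e.g. ("A", "G")
    length     -- total primer length, within the 5-50 nucleotide range
    """
    flank = "".join(left_flank.split()).upper()
    if not 5 <= length <= 50:
        raise ValueError("primer length should be 5-50 nucleotides")
    if len(flank) < length - 1:
        raise ValueError("left flank is too short for the requested length")
    primers = {}
    for allele in alleles:
        allele = allele.upper()
        if len(allele) != 1 or allele not in "ACGT":
            raise ValueError("only single-base alleles are handled here")
        # The last (length - 1) bases of the flank plus the allele at the 3' end.
        primers[allele] = flank[-(length - 1):] + allele
    return primers


if __name__ == "__main__":
    # Left flank of the first subsequence listed in this document.
    print(allele_specific_primers("GCTCTGAAAC TTACTAGCCC", ("A", "G")))
```

Each allele-specific primer would be paired with the same constant primer on the
opposite strand, so that a product is expected mainly when the 3' base matches the
template.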

In accordance with the present invention diagnostic primers are provided, comprising
the sequences set out below as well as derivatives thereof wherein about 6-8 of
the nucleotides at the 3' terminus are identical to the sequences given below and
wherein up to 10, such as up to 8, 6, 4, 2, or 1 of the remaining nucleotides may be
varied without significantly affecting the properties of the diagnostic primer. Conveniently,
the sequence of the diagnostic primer is as written below.

Furthermore, as described above at least two sets of primer(s) and/or probe(s) may
be combined in the method thereby increasing the correlation probability. This second
or other set of primer(s) and/or probe(s) may be a nucleotide or nucleotide
analogue hybridising to a region within the region r or to a sequence different from
the region r. Said sequence different from the region r is preferably a region in
chromosome 19, preferably in chromosome 19q. In particular such second or other
primer or probe may be selected from one or more of the sequences below, or the
complementary strands:

GCCCCGTCCCAGGTA (SEQ ID NO:21)
AGCCCCAAGACCCTTTCACT (SEQ ID NO:22)
GTCCCATAGATAGGAGTGAAAG (SEQ ID NO:23)
CCCTAGGACACAGGAGCACA (SEQ ID NO:24)
TTGTGCTTTCTCTGTGTCCA (SEQ ID NO:25)
TATCAGAAAAGGCTGGAGGA (SEQ ID NO:26)
GAGTGGCTGGGGAGTAGGA (SEQ ID NO:27)
GCCAAGCAGAAGAGACAAA (SEQ ID NO:28)
CCTCAGATGTCCTCTGCTCA (SEQ ID NO:29)
GCCACAGCCCCAGCAAGTAG (SEQ ID NO:30)
AGGACCACAGGACACGCAGA (SEQ ID NO:31)
CATAGAACAGTCCAGAACAC (SEQ ID NO:32)
TTAGCTTGGCACGGCTGTCCAAGGA (SEQ ID NO:33)
ACAGAATTCGCCCCGGCCTGGTACAC (SEQ ID NO:34)
TTGAAACTGGAACTCTGAGAAGG (SEQ ID NO:35)
TGGTGGATGGTGTGAAGCA (SEQ ID NO:36)
CCTTTCTCCAACTTCTTCTCCATTTCCACC (SEQ ID NO:37)
GGGGATCATGTCGTCAATGGACT (SEQ ID NO:38)
ATGCCCTGTAGGTTCAATGG (SEQ ID NO:39)
TGGAGGTCTTTAGGGGCTTG (SEQ ID NO:40)
GGCTGGTCCCCGTCTTCTCCTTCC (SEQ ID NO:41)
TCTCTGTTGCCACTTCAGCCTC (SEQ ID NO:42)
GTCCTGCCCTCAGCAAAGAGAA (SEQ ID NO:43)
TTCTCCTGCGATTAAAGGCTGT (SEQ ID NO:44)
ATCCTGTCCCTACTGGCCATTC (SEQ ID NO:45)
TGTGGACGTGACAGTGAGAAAT (SEQ ID NO:46)
TGGAGTGCTATGGCACGATCTCT (SEQ ID NO:47)
CCATGGGCATCAAATTCCTGGGA (SEQ ID NO:48)
CACACCTGGCTCATTTTTGTAT (SEQ ID NO:49)
TCATCCAGGTTGTAGATGCCA (SEQ ID NO:50)
AGGCTCAACAAGGAAAAATGC (SEQ ID NO:51)
GCTAGACAGTCAAGGAGGGACG (SEQ ID NO:52)
AAAGGGTGGGTGTGGGAGACATTGG (SEQ ID NO:53)
AAACCAACCTAGGCACCCCAAA (SEQ ID NO:54)
CAGTGTCCAAAGAGCACC (SEQ ID NO:55)
CTACCCCTTTAGCGACC (SEQ ID NO:56)
TCCTGCCCCCAGAGCGTCACC (SEQ ID NO:57)
GTACGGTCCACATAATTTTGGAGGA (SEQ ID NO:58)
CGACGAACTTCTCTGAAGCGAA (SEQ ID NO:59)
AGCGACACGGGCATCTGG (SEQ ID NO:60)
ATGAGCGTCCACCTCCTGAACC (SEQ ID NO:61)
AGGCAGCAGCATCGTCATCCCC (SEQ ID NO:62)
TGCATAGCTAGGTCCTGC (SEQ ID NO:63)
AACTGAGRAAACTAGCTCTATGGGGTGGTGCCGCA (SEQ ID NO:64)
CTGGCTCTGAAACTTACTAGCCC (SEQ ID NO:65)
GCTGGACTGTCACCGCATG (SEQ ID NO:66)
GGAGCAGGGTTGGCGTG (SEQ ID NO:67)
TGCCCTCCCAGAGGTAAGGCCT (SEQ ID NO:68)
CCCTCCCGGAGGTAAGGCCTC (SEQ ID NO:69)
GATCAAAGAGACAGACGAGC (SEQ ID NO:70)
GAAGCCCAGGAAATGC (SEQ ID NO:71)
GGACGCCCACCTGGCCAACC (SEQ ID NO:72)
CGTGCTGCCCAACGAAGTG (SEQ ID NO:73)

The primers and probes may be manufactured using any convenient method of
synthesis. Examples of such methods may be found in standard textbooks, for example
"Protocols for Oligonucleotides and Analogues; Synthesis and Properties,"
Methods in Molecular Biology Series; volume 20; Ed. Sudhir Agrawal, Humana
ISBN: 0-89603-247-7; 1993; 1st Edition. If required the primer(s) and probe(s)
may be labelled to facilitate detection.

Kit

According to another aspect of the present invention, there is provided a diagnostic
kit comprising at least one diagnostic primer of the invention and/or at least one
allele-specific oligonucleotide primer of the invention.



The diagnostic kits may comprise appropriate packaging and instructions for use in
the methods of the invention. Such kits may further comprise appropriate buffer(s)
and polymerase(s) such as thermostable polymerases, for example Taq polymerase.

Preferred kits can comprise means for amplifying the relevant sequence such as
primers, polymerase, deoxynucleotides, buffer, metal ions; and/or means for discriminating
the polymorphism, such as one or a set of probes hybridising to the polymorphic
site, a sequence reaction covering the polymorphic site, an enzyme or an
antibody; and/or a secondary amplification system, such as enzyme-conjugated
antibodies, or fluorescent antibodies. The kit-of-parts preferably also comprises a
detection system, such as a fluorometer, a film, an enzyme reagent or another
highly sensitive detection device.

The methods described herein may be performed, for example, by utilizing pre-packaged
diagnostic kits. The invention therefore also encompasses kits for detecting
the presence of a polypeptide or nucleic acid of the invention in a biological
sample (i.e., a test sample). Such kits can be used, e.g., to determine if a subject is
suffering from or is at increased risk of developing a disorder associated with a
disorder-causing allele, or aberrant expression or activity of a polypeptide of the invention.
For example, the kit can comprise a labeled compound or agent capable of
detecting the polypeptide or mRNA or DNA or RAI gene sequences, e.g., encoding
the polypeptide in a biological sample. The kit can further comprise a means for determining
the amount of the polypeptide or mRNA in the sample (e.g., an antibody
which binds the polypeptide or an oligonucleotide probe which binds to DNA or
mRNA encoding the polypeptide). Kits can also include instructions for observing
that the tested subject is suffering from or is at risk of developing a disorder associated
with aberrant expression of the polypeptide if the amount of the polypeptide or
mRNA encoding the polypeptide is above or below a normal level, or if the DNA
correlates with presence of an RAI allele that causes a disorder.

For antibody-based kits, the kit can comprise, for example: (1) a first antibody (e.g.,
attached to a solid support) which binds to a polypeptide of the invention; and, optionally,
(2) a second, different antibody which binds to either the polypeptide or to
the first antibody and is conjugated to a detectable agent.






Identification of an allele as having implication for risk of cancer 



An allele in the s or r region can be identified as correlated with an increased risk of
developing cancer on the basis of statistical analyses of the incidence of a particular
allele in two groups of individuals with and without cancer, respectively, according to
the chi-square (χ2) test, which is well known in the art. Furthermore, an allele in the region can be
identified as an allele correlated with prognosis of cancer on the basis of statistical
analyses of the incidence of a particular allele in individuals demonstrating different
prognostic characteristics.
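
The comparison of allele incidence between two groups described above can be sketched
as a Pearson chi-square test on a 2x2 table of carrier counts. The Python example below
uses only the standard library; the function name chi_square_2x2 and the counts are
hypothetical and serve only to illustrate the calculation (no continuity correction is
applied).

```python
import math


def chi_square_2x2(cases_with, cases_without, controls_with, controls_without):
    """Pearson chi-square statistic and p-value (1 degree of freedom).

    Arguments are counts of individuals with/without the allele among
    cancer cases and among controls without cancer.
    """
    a, b, c, d = cases_with, cases_without, controls_with, controls_without
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column total must be non-zero")
    statistic = n * (a * d - b * c) ** 2 / denominator
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


if __name__ == "__main__":
    # Hypothetical counts: 30 of 100 cases and 15 of 100 controls carry the allele.
    statistic, p_value = chi_square_2x2(30, 70, 15, 85)
    print(round(statistic, 2), round(p_value, 4))
```

A small p-value would indicate that the allele frequency differs between the two groups
more than expected by chance; the threshold applied is a matter of study design.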

Identification of humans having increased likelihood of responding to treatment

It is further contemplated that the present invention provides a method for identifying
a human subject as having an increased likelihood of responding positively to a
cancer treatment, comprising determining the presence in the subject of an s or r region
allele genotype correlated with an increased likelihood of positive response to
treatment, whereby the presence of the genotype identifies the subject as having an
increased likelihood of responding to cancer treatment.

The treatment mentioned herein may be any cancer treatment, such as conventional
cancer treatment, for example X-ray, chemotherapeutics, surgical excision or combinations
thereof.

Protein Products of the Gene(s)

Gene products of the region s or r, or peptide fragments thereof, can be prepared for
a variety of uses. For example, such gene products, or peptide fragments thereof,
can be used for the generation of antibodies, in diagnostic assays.

The gene products of the invention include, but are not limited to, human RAI gene
products and ASE-1 gene products. In the following the invention is described in
relation to the RAI gene product.

Gene product, sometimes referred to herein as a "protein" or "polypeptide", includes
those gene products encoded by the RAI gene sequences shown as position
7621-21350 in SEQ ID NO: 1. Among gene product variants are gene products
comprising amino acid residues encoded by the polymorphisms. Such gene product
variants also include a variant of the RAI gene product.

In addition, RAI gene products may include proteins that represent functionally equivalent
gene products. In preferred embodiments, such functionally equivalent RAI
gene products are naturally occurring gene products. Functionally equivalent RAI
gene products also include gene products that retain at least one of the biological
activities of the RAI gene products described above, and/or which are recognized by
and bind to antibodies (polyclonal or monoclonal) directed against RAI gene products.

Antibodies to Gene Products

Described herein are methods for the production of antibodies capable of specifically
recognizing one or more gene product epitopes or epitopes of conserved variants
or peptide fragments of the gene products. Furthermore, antibodies that specifically
recognize mutant forms are encompassed by the invention. The terms "specifically
bind" and "specifically recognize" refer to antibodies that bind to RAI gene
product epitopes at a higher affinity than they bind to non-RAI (e.g., random) epitopes.

Such antibodies may include, but are not limited to, polyclonal antibodies, monoclonal
antibodies (mAbs), humanized or chimeric antibodies, single chain antibodies,
Fab fragments, F(ab')2 fragments, fragments produced by a Fab expression library,
anti-idiotypic (anti-Id) antibodies, and epitope-binding fragments of any of the above,
including the polyclonal and monoclonal antibodies described below. Such antibodies
may be used, for example, in the detection of a gene product in a biological
sample and may, therefore, be utilized as part of a diagnostic or prognostic technique
whereby patients may be tested for abnormal levels of gene products, and/or
for the presence of abnormal forms of such gene products. Such antibodies may
also be utilized in conjunction with, for example, compound screening schemes, as
described below, for the evaluation of the effect of test compounds on gene product
levels and/or activity.

For the production of antibodies against a gene product, various host animals may
be immunized by injection with a RAI gene product, or a portion thereof. Such host
animals may include, but are not limited to, rabbits, mice, and rats, to name but a
few. Various adjuvants may be used to increase the immunological response, depending
on the host species, including but not limited to Freund's (complete and incomplete),
mineral gels such as aluminum hydroxide, surface active substances
such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole
limpet hemocyanin, dinitrophenol, and potentially useful human adjuvants such as
BCG (bacille Calmette-Guerin) and Corynebacterium parvum.

Polyclonal antibodies are heterogeneous populations of antibody molecules derived
from the sera of animals immunized with an antigen, such as a gene product, or an
antigenic functional derivative thereof. For the production of polyclonal antibodies,
host animals such as those described above may be immunized by injection with
gene product supplemented with adjuvants as also described above.

Monoclonal antibodies, which are homogeneous populations of antibodies to a particular
antigen, may be obtained by any technique that provides for the production of
antibody molecules by continuous cell lines in culture. These include, but are not
limited to, the hybridoma technique of Kohler and Milstein (1975, Nature 256:495-497;
and U.S. Pat. No. 4,376,110), the human B-cell hybridoma technique (Kosbor
et al., 1983, Immunology Today 4:72; Cole et al., 1983, Proc. Natl. Acad. Sci. U.S.A.
80:2026-2030), and the EBV-hybridoma technique (Cole et al., 1985, Monoclonal
Antibodies And Cancer Therapy, Alan R. Liss, Inc., pp. 77-98). Such antibodies may
be of any immunoglobulin class including IgG, IgM, IgE, IgA, IgD and any subclass
thereof. The hybridoma producing the mAb of this invention may be cultivated in
vitro or in vivo. Production of high titers of mAbs in vivo makes this the presently
preferred method of production.

In addition, techniques developed for the production of "chimeric antibodies" (Morrison
et al., 1984, Proc. Natl. Acad. Sci., 81:6851-6855; Neuberger et al., 1984, Nature
312:604-608; Takeda et al., 1985, Nature, 314:452-454) by splicing the genes
from a mouse antibody molecule of appropriate antigen specificity together with
genes from a human antibody molecule of appropriate biological activity can be
used. A chimeric antibody is a molecule in which different portions are derived from
different animal species, such as those having a variable region derived from a
murine mAb and a human immunoglobulin constant region. (See, e.g., Cabilly et al.,
U.S. Pat. No. 4,816,567; and Boss et al., U.S. Pat. No. 4,816,397, which are incorporated
herein by reference in their entirety.)

In addition, techniques have been developed for the production of humanized
antibodies. (See, e.g., Queen, U.S. Pat. No. 5,585,089, which is incorporated herein by
reference in its entirety.) An immunoglobulin light or heavy chain variable region
consists of a "framework" region interrupted by three hypervariable regions, referred
to as complementarity determining regions (CDRs). The extent of the framework
region and CDRs has been precisely defined (see "Sequences of Proteins of
Immunological Interest", Kabat, E. et al., U.S. Department of Health and Human
Services (1983)). Briefly, humanized antibodies are antibody molecules from non-human
species having one or more CDRs from the non-human species and a framework
region from a human immunoglobulin molecule.

Alternatively, techniques described for the production of single chain antibodies
(U.S. Pat. No. 4,946,778; Bird, 1988, Science 242:423-426; Huston et al., 1988,
Proc. Natl. Acad. Sci. U.S.A. 85:5879-5883; and Ward et al., 1989, Nature
334:544-546) can be adapted to produce single chain antibodies against gene products.
Single chain antibodies are formed by linking the heavy and light chain fragments of the
Fv region via an amino acid bridge, resulting in a single chain polypeptide.

Antibody fragments that recognize specific epitopes may be generated by known
techniques. For example, such fragments include but are not limited to: the F(ab')2
fragments, which can be produced by pepsin digestion of the antibody molecule, and
the Fab fragments, which can be generated by reducing the disulfide bridges of the
F(ab')2 fragments. Alternatively, Fab expression libraries may be constructed (Huse
et al., 1989, Science 246:1275-1281) to allow rapid and easy identification of
monoclonal Fab fragments with the desired specificity.

Immunoassays for gene products, conserved variants, or peptide fragments thereof
will typically comprise incubating a sample, such as a biological fluid, a tissue
extract, freshly harvested cells, or lysates of cells, in the presence of a detectably
labeled antibody capable of identifying gene product, conserved variants or peptide
fragments thereof, and detecting the bound antibody by any of a number of
techniques well-known in the art.

The biological sample may be brought in contact with and immobilized onto a solid
phase support or carrier, such as nitrocellulose, that is capable of immobilizing cells,
cell particles or soluble proteins. The support may then be washed with suitable
buffers followed by treatment with the detectably labeled gene product specific
antibody. The solid phase support may then be washed with the buffer a second time to
remove unbound antibody. The amount of bound label on the solid support may
then be detected by conventional means.

By "solid phase support or carrier" is intended any support capable of binding an
antigen or an antibody. Well-known supports or carriers include glass, polystyrene,
polypropylene, polyethylene, dextran, nylon, amylases, natural and modified
celluloses, polyacrylamides, gabbros, and magnetite. The nature of the carrier can be
either soluble to some extent or insoluble for the purposes of the present invention.

The support material may have virtually any possible structural configuration so long
as the coupled molecule is capable of binding to an antigen or antibody. Thus, the
support configuration may be spherical, as in a bead, or cylindrical, as in the inside
surface of a test tube, or the external surface of a rod. Alternatively, the surface may
be flat, such as a sheet, test strip, etc. Preferred supports include polystyrene beads.
Those skilled in the art will know many other suitable carriers for binding antibody or
antigen, or will be able to ascertain the same by use of routine experimentation.

One of the ways in which the RAI gene product-specific antibody can be detectably
labeled is by linking the same to an enzyme, such as malate dehydrogenase,
staphylococcal nuclease, delta-5-steroid isomerase, yeast alcohol dehydrogenase,
alpha-glycerophosphate dehydrogenase, triose phosphate isomerase, horseradish
peroxidase, alkaline phosphatase, asparaginase, glucose oxidase, beta-galactosidase,
ribonuclease, urease, catalase, glucose-6-phosphate dehydrogenase, glucoamylase and
acetylcholinesterase. The detection can be accomplished by colorimetric methods
that employ a chromogenic substrate for the enzyme. Detection may also be
accomplished by visual comparison of the extent of enzymatic reaction of a substrate
in comparison with similarly prepared standards.

Detection may also be accomplished using any of a variety of other immunoassays,
for example by radioactively labeling the antibodies or antibody fragments, or by
labeling the antibody with a fluorescent compound. Among the most commonly used
fluorescent labeling compounds are fluorescein isothiocyanate, rhodamine,
phycoerythrin, phycocyanin, allophycocyanin, o-phthaldehyde and fluorescamine.

The antibody can also be detectably labeled using fluorescence emitting metals
such as 152Eu, or others of the lanthanide series, or by coupling it to a
chemiluminescent compound.

Diseases

Described herein are various applications of gene sequences, gene products,
including peptide fragments and fusion proteins thereof, and of antibodies directed
against gene products and peptide fragments thereof. Such applications include, for
example, prognostic and diagnostic evaluation of cancer and the identification of
subjects with a predisposition to such disorders, as described above.

The method according to the invention may be used in relation to any cancer form,
such as, but not limited to, skin carcinoma including malignant melanoma, breast
cancer, lung cancer, colon cancer and other cancers in the gastrointestinal tract,




prostate cancer, lymphoma, leukemia, pancreas cancer, head and neck cancer,
ovary cancer and other gynecological cancers. In particular, the method is relevant
for skin cancer, lung cancer, colon cancer and breast cancer, such as skin cancer
and breast cancer, preferably wherein the skin cancer is basal cell carcinoma.

In particular, the method is relevant for early age cancer, such as early age breast
cancer.

Gene nucleic acid sequences, described above, can be utilized for transferring
recombinant nucleic acid sequences to cells and expressing said sequences in
recipient cells. Such techniques can be used, for example, in marking cells or for the
treatment of cancer. Such treatment can be in the form of gene replacement
therapy. Specifically, one or more copies of a normal RAI gene, or a portion of the RAI
gene that directs the production of an RAI gene product exhibiting normal RAI gene
function, may be inserted into the appropriate cells within a patient, using vectors
that include, but are not limited to, adenovirus, adeno-associated virus, and
retrovirus vectors, in addition to other particles that introduce DNA into cells, such as
liposomes.

Examples

The examples relate to the prediction of cancer from sequence polymorphisms in the
region s or r. Blood was collected before (example 6) or after (examples 1 through 5)
the persons acquired cancer. However, the sampling time is considered immaterial,
as DNA in a polyclonal blood sample is not expected to change over time.

The particular sequence polymorphisms analysed in these examples are listed in
Table 6, together with their sources of information and their definition as sequences.

Table 6. The markers used, their sources of information, and their currently
estimated positions on chromosome 19, as well as their position in figure 2.

Name         Source of        Position in   GenBank Accession    Position on      Position
             identification   sequence      Number of sequence   chromosome 19    in figure

XRCC1 e10    Ref. 1           -             L3407Q               [illegible]       1
CKM e8       -                -             -                    [illegible]       2
XPD e23      -                -             -                    61.479            3
XPD e10      Ref. 1           23591         -                    61.491            4
XPD e6       Ref. 1           22541         L47234               61.4923           5
XPD i4       rs#1618536       19244         L47234               61.4924           6
RAI e6       rs#6968          8786          L47234               61.506            7
RAI i1       rs#1970764       875           L47234               61.514            8
ASE1 e1      rs#967591        232125        NT_011242            61.534            9
ERCC1 e4     Ref. 1           19007         M63796               61.547           10
FOSB e4      rs#1049698       34621         M89651               61.601           11
SLC1A5 e8    rs#1060043       60620         AC008622             62.946           12
GLTSCR1 e1   rs#1035938       20775         AC010519             63.986           13
LIG1 e6      rs#20580         111           L27710               65.460           14

rs numbers were derived from the NCBI's database dbSNP.
Ref. 1: Shen, M.R., Jones, I.M., and Mohrenweiser, H. (1998) Nonconservative
amino acid substitution variants exist at polymorphic frequency in DNA repair genes
in healthy humans. Cancer Res., 58: 604-8, 1998.



MATERIALS AND METHODS 

Study groups. The groups of Caucasian Americans with and without BCC have
been described previously (Athas et al., Cancer Res. 51:5786-5793, 1991; Wei et al.,
Proc. Natl. Acad. Sci. USA 90:1614-8, 1994). Briefly, the study was a clinic-based
case-control study at the Johns Hopkins Hospital, which serves multiple participating
dermatologists in Maryland. Cases were histopathologically confirmed primary
BCCs diagnosed between 1987 and 1990. The controls were patients from the
same physician practices who had a diagnosis of mild skin disorders. All participants
were Caucasians living near Baltimore and were between 20 and 60 years of age.
The controls were frequency matched to the cases by age and sex. Cases and
controls with any other forms of cancer were excluded. In the questionnaire, the study
subjects were asked if they had any blood relatives with skin cancer, and were
asked to specify the type of cancer. Study subjects with relatives with basal cell
carcinoma, squamous cell carcinoma and 'skin cancer' were included in the group
of subjects with a family history of skin cancer. Subjects with relatives with melanoma
were not included. At the clinic visit the subjects gave informed consent, were examined
by dermatologists, completed a structured questionnaire and provided blood. DNAs
from available frozen lymphocytes were purified using Puregene (Gentra Systems)
and were genotyped. Initially, 71 cases and 118 controls were included in this study.
However, the number of persons varied between analyses, as the supply of DNAs
was gradually depleted. In the case of the SNP RAIi1, only 133 persons could be
genotyped reliably.

The groups of 20 psoriatic Danes with and 20 psoriatic Danes without BCC have
been described previously (Dybdahl et al., Cancer Epidemiol. Biomarkers Prev.
8:77-81, 1999). Briefly, BCC subjects were identified from a population-based cohort
of persons treated by Danish dermatologists in the year 1995, and fulfilled the
following criteria: (a) age in 1995 < 50 years; and (b) clinically verified diagnosis of
psoriasis. The diagnosis of BCC was clinically and histologically confirmed. The controls,
consisting of psoriasis cases without BCC, were selected from among patients treated
in the years 1992-1995 for psoriasis by dermatologists who participated in the
national cohort study 1995. The controls were matched by age and sex. The patients
with psoriasis and BCC differed from the national cohort of BCC in that the average
age at first BCC was 38 years against 56 years in the cohort. A number of cases had had
multiple BCCs. There was a tendency that cases had been treated for a longer time
than the controls, and also that the treatments were more intense. This was to be
expected, as treatment of psoriasis involves a number of carcinogenic treatment
modalities. DNAs from available frozen lymphocytes were purified using Puregene
(Gentra Systems) and were genotyped.

Primers and probes. Table 7 includes the polymorphisms typed on LightCycler™, the
primers used for the PCR reaction and the probes used for detection and typing of
the PCR products. Table 8 lists the polymorphisms typed by conventional PCR-RFLP,
and the primers and restriction enzymes used. Table 9 lists the polymorphisms
typed by SNaPshot technology and the primers used. Table 10 lists the
polymorphisms analyzed on a Taqman, and the primers and probes used. Hobolth DNA,
Hillerod, Denmark or DNA Technology, Aarhus, Denmark, synthesized the primers
in tables 7, 8, and 9. TIB Molbiol, Berlin, Germany, synthesized the LightCycler
probes. TAG-Copenhagen ApS (Tagc.com, Copenhagen, Denmark) synthesized the
primers, and Applied Biosystems synthesized the fluorescent Taqman probes in table
10.

Table 7. Design of primers and fluorogenic probes for LightCycler™

ASE1 e1
Forward primer: 5'-GGTTTTCTGCTCTGCACACG
Reverse primer: 5'-CCTTTCTCCTTCCACCAACG
Anchor probe: 5'-TCTGCAACCTGGTGCGAGCAGC-Fluorescein
Sensor probe: 5'-LCRed640-CGGGCTACAGGGTTACCTGAG-p

CKM e8
Forward primer: 5'-TTGAAACTGGAACTCTGAGAAGG
Reverse primer: 5'-TGGTGGATGGTGTGAAGCA
Anchor probe: 5'-LCRed640-CCTTTCTCCAACTTCTTCTCCATTTCCACO-p
Sensor probe: 5'-GGGGATCATGTCGTCAATGGACT-Fluorescein

ERCC1 e4
Forward primer: 5'-AGGACCACAGGACACGCAGA-3'
Reverse primer: 5'-CATAGAACAGTCCAGAACAC-3'
Anchor probe: 5'-LCRed640-TGGCGACGTAATTCCCGACTATGTGCTG-p-3'
Sensor probe: 5'-CGCAACGTGCCCTGGGAAT-Fluorescein

FOSB e4
Forward primer: 5'-AGGCTCAACAAGGAAAAATGC
Reverse primer: 5'-GCTAGACAGTCAAGGAGGGACG
Anchor probe: 5'-LCRed640-AAAGGGTGGGTGTGGGAGACATTGG-p
Sensor probe: 5'-AAACCAACCTAGGCACCCCAAA-Fluorescein

GLTSCR1 e1
Forward primer: 5'-CGACGAACTTCTCTGAAGCGAA
Reverse primer: 5'-AGCGACACGGGCATCTGG
Anchor probe: 5'-ATGAGCGTCCACCTCCTGAACC-Fluorescein
Sensor probe: 5'-LCRed640-AGGCAGCAGCATCGTCATCCCC-p

LIG1 e6
Forward primer: 5'-ATGCCCTGTAGGTTCAATGG
Reverse primer: 5'-TGGAGGTCTTTAGGGGCTTG
Anchor probe: 5'-GGCTGGTCCCCGTCTTCTCCTTCC-Fluorescein
Sensor probe: 5'-LCRed640-TCTCTGTTGCCACTTCAGCCTC-p

RAI i1
Forward primer: 5'-TGGCTAACACGGTGAAACC
Reverse primer: 5'-GGAATCCAAAGATTCTATGATGG
Anchor probe: 5'-GGGAGGCGGAGCTTGCAGTGA-Fluorescein
Sensor probe: 5'-LCRed640-CTGAGATCGCACCACTGCAC-p

SLC1A5 e8
Forward primer: 5'-CAGTGTCCAAAGAGCACC
Reverse primer: 5'-CTACCCC"TTTAGCGACC
Anchor probe: 5'-LCRed640-TCCTGCCCCCAGAGCGTCACC-p
Sensor probe: 5'-GTACGGTCCACATAATTTTGGAGGA-Fluorescein

XPD e10
Forward primer: 5X3ATCAAAGAGACAGACGAGC
Reverse primer: 5'-GAAGCCCAGGAAATGC
Anchor probe: 5'-GGACGCCCACCTGGCCAACOFluorescein
Sensor probe: 5'-LCRed640-CGTGCTGCCCAACGAAGTG-p

Table 8. Primers and restriction enzymes used for typing of SNPs using PCR-RFLP

Gene exon       Primers                      Enzyme   Fragments

XRCC1 exon 10   TTGTGCTTTCTCTGTGTCCA         MspI     240, 375 bp (A);
                TATCAGAAAAGGCTGGAGGA                  615 bp (G)

ERCC1 exon 4    AGGACCACAGGACACGCAGA         BsrDI    157, 368 bp (A);
                CATAGAACAGTCCAGAACAC                  525 bp (G)

XPD exon 6
  1. set        CACACCTGGCTCATTTTTGTAT
                TCATCCAGGTTGTAGATGCCA
  2. set        TGGAGTGCTATGGCACGATCTCT      TfiI     68, 114, 482 bp (A);
                CCATGGGCATCAAATTCCTGGGA               68, 598 bp (C)

XPD exon 23
  1. set        GTCCTGCCCTCAGCAAAGAGAA
                TTCTCCTGCGATTAAAGGCTGT
  2. set        ATCCTGTCCCTACTGGCCATTC       PstI     66, 100, 158 bp (C);
                TGTGAACGTGACAGTGAGAAAT                100, 224 bp (A)
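The fragment patterns in Table 8 translate directly into a genotype call: a homozygote shows only the fragments of one allele, a heterozygote shows the fragments of both. The following is a minimal Python sketch of that reading, using the ERCC1 exon 4 fragment sizes from the table. The function name, the data layout and the 10 bp sizing tolerance are illustrative assumptions and are not part of the described method.

```python
# Expected restriction fragments (bp) per allele, taken from Table 8 (ERCC1 exon 4).
ERCC1_E4_PATTERNS = {"A": (157, 368), "G": (525,)}


def call_genotype(observed_bands, patterns, tolerance_bp=10):
    """Call a genotype from observed gel band sizes.

    An allele is scored as present when every one of its expected fragments
    matches an observed band within tolerance_bp (a hypothetical tolerance).
    Bands that match no expected fragment are ignored in this sketch.
    """
    def band_present(expected):
        return any(abs(expected - band) <= tolerance_bp for band in observed_bands)

    alleles = sorted(
        allele
        for allele, fragments in patterns.items()
        if all(band_present(f) for f in fragments)
    )
    if not alleles:
        raise ValueError("no allele pattern matches the observed bands")
    if len(alleles) == 1:
        return alleles[0] * 2      # homozygote, e.g. "AA"
    return "".join(alleles)        # heterozygote, e.g. "AG"


if __name__ == "__main__":
    for bands in ([157, 368], [525], [157, 368, 525]):
        print(bands, "->", call_genotype(bands, ERCC1_E4_PATTERNS))
```

The same function can be reused for the other markers by supplying their own allele-to-fragment mapping.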

Table 9. Design of primers and SNaPshot primers for SNaPshot typing on
sequenator.

XRCC1 exon7
Forward primer: 5'-GTCCCATAGATAGGAGTGAAAG
Reverse primer: 5'-CCCTAGGACACAGGAGCACA
SNaPshot primer: 5'-TGCATAGCTAGGTCCTGC

XRCC1 exon17
Forward primer: 5'-GCCAAGCAGAAGAGACAAA
Reverse primer: 5'-GAGTGGCTGGGGAGTAGGA
SNaPshot primer: 5'-AACTGACRAAACTAGCTCTATGGGGTGGTGCCGCA

RAI exon6
Forward primer: 5'-CCTACCACCATCATCACATCC
Reverse primer: 5'-GCCTTGCGAAAAATCATAACC
SNaPshot primer: 5'-CCTCTCCCCAATTAAGTGCCTTCACACAGC

XPD intron4
Forward primer: 5'-CGCAAAAACTTGTGTATTCACC
Reverse primer: 5'-CCCATTTTTATCATCAGCAACC
SNaPshot primer: 5'-CTGGCTCTGAAACTTACTAGCCC

Table 10. Design of primers and probes for Taqman.

XRCC1 exon10
Forward primer: 5'-GCT GGA CTG TCA CCG CAT G
Reverse primer: 5'-GGA GCA GGG TTG GCG TG
Probe (A): 5'-Fam-TGC CCT CCC AGA GGT AAG GCC T-Tamra
Probe (G): 5'-Vic-CCC TCC CGG AGG TAA GGC CTC-Tamra



Determination of polymorphisms by LightCycler. Genotypes of the American persons
for polymorphisms in ASE-1e1, CKMe8, ERCC1e4, FOSBe4, GLTSCR1e1, LIG1e6,
RAIi1, SLC1A5e8 and XPDe10 and of the Danish persons for polymorphisms
ASE-1e1, CKMe8, FOSBe4, LIG1e6 and SLC1A5e8 were detected using LightCycler™
(Roche Molecular Biochemicals, Mannheim, Germany). PCR was performed by
rapid cycling in a reaction volume of 20 µl with 0.5 µM of each primer, 0.045 µM of
anchor and sensor probe, 3.5 mM MgCl2, approximately 7-25 ng genomic DNA,
and 2 µl LightCycler DNA Master Hybridization probe buffer (Roche Molecular
Biochemicals, Cat. No. 2158 825). This buffer contains Taq DNA polymerase, dNTP
mix, and 10 mM MgCl2. In some cases the reaction mixture also contained 5%
DMSO. The temperature cycling consisted of denaturation at 95°C for 2 sec,
followed by 46 cycles consisting of 2 sec at 95°C, 10 sec at 57°C, and 30 sec at 72°C.
The last annealing period at 72°C was extended to 120 sec. The melting profile was
determined by a temperature ramp from 50°C to 95°C with a rate of 0.1 degree/sec.
For RAIi1 the melting profile was run 3 times, and the last curve was used.
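The molar amounts of primer and probe per reaction follow from amount = concentration x volume. The short Python sketch below works this out for the LightCycler reaction, assuming that the stated primer and probe concentrations are final concentrations in the 20 µl reaction volume; the function name is illustrative.

```python
def amount_pmol(conc_um: float, volume_ul: float) -> float:
    """Amount in pmol for a concentration in µM and a volume in µl.

    1 µM equals 1 pmol/µl, so the product of the two numbers is in pmol.
    """
    return conc_um * volume_ul


REACTION_VOLUME_UL = 20.0  # reaction volume stated in the text

# Assumption: 0.5 µM and 0.045 µM are final concentrations in the reaction.
primer_pmol = amount_pmol(0.5, REACTION_VOLUME_UL)    # each primer
probe_pmol = amount_pmol(0.045, REACTION_VOLUME_UL)   # anchor and sensor probe

if __name__ == "__main__":
    print(f"each primer: {primer_pmol:.2f} pmol per reaction")
    print(f"each probe:  {probe_pmol:.2f} pmol per reaction")
```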

PCR-RFLP analyses. Genotypes of the American persons for polymorphisms in
XPDe6 and XPDe23 and of Danish psoriatics for polymorphisms in XRCC1e10,
ERCC1e4, XPDe6, and XPDe23 were detected using PCR-RFLP technique (Shen
et al., see above; Dybdahl et al., see above; Vogel et al., Cancer Epidemiol.
Biomarkers Prev., 8:77-81 (2001)). The reactions were performed as reported in
these references.

Determination of polymorphisms by SNaPshot technique on sequenator. The
polymorphisms in RAIe6, XPDi4, XRCC1e7, and XRCC1e17 in the American persons
were typed simultaneously on an ABI Prism 310 sequenator (Applied Biosystems,
Foster City, CA, USA) using SNaPshot technique (Lindblad-Toh et al., Nature
Genetics 24:381-6, 2000). The PCR reaction consisted of 1 µl of purified genomic
DNA, 1 pmole of each primer (DNA Technology, Aarhus, Denmark), 12.5 nmole of
each dNTP (Bioline, London, UK), 100 nmole MgCl2 (Bioline), and 0.15 µl BIOTAQ™
DNA Polymerase (Bioline) in a total volume of 20 µl of water. The program
consisted of 4 min at 96°C, followed by 25 cycles of 96°C for 30 sec, 60°C for 30 sec,
and 72°C for 60 sec. The last cycle was followed by 72°C for 6 min. The primers
and dNTPs were removed in reactions containing 2 U Shrimp Alkaline Phosphatase
(SAP) (Roche), 2 U Exonuclease I (Biolabs, Denmark), and 9 µl PCR reaction in a
total volume of 14 µl water. The reactions were incubated at 37°C for 60 min and
72°C for 15 min. The SNaPshot reactions contained 1 µl of SNaPshot Ready Reaction
Mix (Applied Biosystems), 0.5 µl of each SNaPshot primer (XRCCe7-ss1: 4 pmol/µl;
XPDi5-cp1: 0.5 pmol/µl; RAIe7-cp1: 1 pmol/µl; XRCCe17-ss1: 2 pmol/µl),
2 µl of the purified PCR product, and 1.5 µl of buffer (200 mM Tris-HCl, 5 mM MgCl2,
pH 9.0). The reactions were cycled 25 times: 96°C for 10 s, 50°C for 5 s, and 60°C
for 30 s. The primers and dNTPs were removed in a reaction containing 1 U SAP,
0.8 µl 10xSAP buffer, and 5 µl SNaPshot reaction in a total volume of 8 µl of water.
Two µl of purified product was added to 10 µl of concentrated deionized formamide
(Amresco, Ohio, USA), incubated for 5 min at 95°C, and analyzed on the
sequenator. The two markers in XRCC1, in exon 7 and exon 17, could not be reliably scored
and thus were excluded from further consideration.

Determination of polymorphisms by real-time PCR using Taqman probes. The
polymorphism in XRCC1e10 in the American persons was analysed using the ABI Prism
7700 sequence detection system (Applied Biosystems, Foster City, CA, USA). PCR
primers and Taqman probes were designed using Primer Express v 1.0 (Applied
Biosystems). The reactions were performed in MicroAmp optical tubes sealed with
MicroAmp optical caps (Applied Biosystems) containing a 10 µl reaction volume: 1x
Taqman buffer A, 2.5 mM MgCl2, 200 µM each of dATP, dCTP, dGTP, 400 µM dUTP,
800 nM each primer, 200 nM each probe, 0.01 U/µl AmpErase UNG, 0.025 U/µl
AmpliTaq Gold Polymerase. Thermal cycler conditions were: tubes were incubated
at 50°C for 2 min followed by 10 min at 95°C. The incubation was succeeded by 45
cycles of 95°C for 15 sec and 64°C for 1 min.
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Example 1 

DNA from humans from the American cohort of patients with basal cell carcinoma
and controls, described in Materials and Methods, was typed with respect to a
number of sequence polymorphisms located in and around the claimed region r. The
resulting statistical p-values for association of occurrence of the individual sequence
polymorphisms with the status of patients are depicted in Figure 2. Also depicted are
the calculated odds ratios for association of sequence polymorphism and disease.
For the calculation of the odds ratios the heterozygote genotypes were combined
with the lesser group of homozygotes, and the ordering of the groups was chosen
such that the odds ratio became more than or equal to 1. The results show that the
sequence polymorphism RAIi1 is strongly associated with disease in this cohort
(p = 0.004). Bonferroni correction for the number of tests made indicates that a result
less than 0.007 must be considered significant at a level of 0.05. Thus, even after
correction for multiplicity of testing this result is significant.
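The Bonferroni rule referred to above divides the overall significance level by the number of tests performed. A minimal Python sketch follows. The number of tests used here (7) is an assumption, chosen only because 0.05/7 is approximately 0.007 and so matches the quoted threshold; the text itself does not state the number of tests. Function names are illustrative.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def is_significant(p_value: float, alpha: float, n_tests: int) -> bool:
    """True when p_value is below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


ALPHA = 0.05      # overall significance level stated in the text
N_TESTS = 7       # assumption: inferred from the quoted threshold of 0.007
P_RAI_I1 = 0.004  # p-value reported for RAIi1 in Example 1

if __name__ == "__main__":
    print(f"threshold: {bonferroni_threshold(ALPHA, N_TESTS):.4f}")
    print("RAIi1 significant after correction:",
          is_significant(P_RAI_I1, ALPHA, N_TESTS))
```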

The numbers next to the points in the curves are merely a help to identify the single
sequence polymorphisms:

1, XRCC1e10; 2, CKMe8; 3, XPDe23; 4, XPDe10; 5, XPDe6; 6, XPDi4; 7, RAIe6; 8,
RAIi1; 9, ASE-1e3; 10, ERCC1e4; 11, FOSBe4; 12, SLC1A5e8; 13, GLTSCR1e1;
14, LIG1e6.
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Example 2 

Those persons in Example 1 who got basal cell carcinoma before the age of 50
years were selected, and the results from analysis of RAIi1 were compared with the
status of the patients. There was a strong relationship between the occurrence of
the individual genotypes of the sequence polymorphism and the status of the
patients (Table 11; Odds ratio = 12.3; p(χ²) = 0.00014).

Table 11. Occurrences of genotypes for the sequence polymorphism RAIi1 in
American cases with basal cell carcinoma occurring before 50 years of age and in
controls.

RAIi1 genotypes    Number of cases            Number of controls
                   before 50 years of age

AA                 31                         44
AG                  2                         32
GG                  0                          5
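The collapsing rule described in Example 1 (heterozygotes pooled with the lesser group of homozygotes, groups ordered so that the odds ratio is at least 1) can be applied to the counts of Table 11 as in the following Python sketch. The function names are illustrative, and the chi-square test with one degree of freedom and optional Yates continuity correction is an assumption: the text does not say which variant of the test was used, so the printed values need not coincide exactly with the reported figures.

```python
import math


def collapse(hom_a, het, hom_b):
    """Pool the heterozygotes with the less frequent homozygote group.

    Each argument is a (cases, controls) pair. Returns (major, pooled).
    """
    lesser, major = (hom_a, hom_b) if sum(hom_a) < sum(hom_b) else (hom_b, hom_a)
    pooled = (lesser[0] + het[0], lesser[1] + het[1])
    return major, pooled


def odds_ratio(group1, group2):
    """Odds ratio of a 2x2 table, ordered so that the result is >= 1."""
    a, b = group1  # cases, controls in group 1
    c, d = group2  # cases, controls in group 2
    if 0 in (a, b, c, d):
        raise ValueError("odds ratio undefined with a zero cell")
    ratio = (a * d) / (b * c)
    return ratio if ratio >= 1 else 1 / ratio


def chi2_p_value(group1, group2, yates=True):
    """p-value of the 2x2 chi-square test (1 degree of freedom)."""
    a, b = group1
    c, d = group2
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the survival function is erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


# Counts from Table 11 as (cases, controls).
AA, AG, GG = (31, 44), (2, 32), (0, 5)

if __name__ == "__main__":
    major, pooled = collapse(AA, AG, GG)
    print(f"odds ratio: {odds_ratio(major, pooled):.1f}")
    print(f"p (Yates-corrected chi-square): {chi2_p_value(major, pooled):.5f}")
```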



Example 3 



The data of Example 2 were combined with results of genotyping the neighbouring
sequence polymorphism RAIe6. There was a very strong association between the
combined genotypes of RAIi1 and RAIe6 and the status of the patients. Thus,
almost all American cases occurring before the age of 50 yrs were homozygote for
RAIi1^A RAIe6^A, while only approximately half of the controls were so (Table 12;
Odds ratio = 12.8; p(χ²) = 0.00006).

Table 12. Combined occurrences of different genotypes for the sequence
polymorphisms RAIi1 and RAIe6 in American cases occurring before 50 years of age and in
controls.

                        RAIi1
             RAIe6      AA      AG      GG

BCC cases    AA         30       0       0
             AT          0       2       0
             TT          0       0       0

Controls     AA         42      10       1
             AT          2      21       0
             TT          1       0       2

Example 4 

The data of Example 2 were combined with results of genotyping the sequence
polymorphism GLTSCR1e1 located outside the claimed region r. There was a very
strong association between the combined genotypes of RAIi1 and GLTSCR1e1 and
the status of the patients. It was obvious to define "risk-genotypes" as having two As
in RAIi1 and at least one C in GLTSCR1e1. This corresponds to the assumptions
that RAIi1^A is recessive, and GLTSCR1e1^C is dominant. If one does so, one finds
that 25 out of 25 cases have a "risk-genotype", while only 26 out of 62 controls have
one (Table 13; Odds ratio > 30; p(χ²) = 0.000002).

Table 13. Combined occurrences of genotypes for the sequence polymorphisms
RAIi1 and GLTSCR1e1 in American cases of basal cell carcinoma occurring before
50 years of age and in controls.

                          RAIi1
             GLTSCR1e1    AA      AG      GG

BCC cases    CC           17       0       0
             CT            8       0       0
             TT            0       0       0

Controls     CC           15      18       3
             CT           13       7       0
             TT            3       3       0
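The "risk-genotype" rule of Example 4 (two As in RAIi1 and at least one C in GLTSCR1e1) is easy to state as a predicate over a person's two genotypes. The Python sketch below is one way to write it; the function names and the representation of a genotype as a two-letter string are illustrative assumptions.

```python
def is_risk_genotype(rai_i1: str, gltscr1_e1: str) -> bool:
    """Apply the Example 4 rule to two-letter genotype strings.

    RAIi1 must be homozygous A (A treated as recessive) and GLTSCR1e1 must
    carry at least one C (C treated as dominant).
    """
    rai = rai_i1.upper()
    glt = gltscr1_e1.upper()
    if len(rai) != 2 or len(glt) != 2:
        raise ValueError("genotypes must be two-letter strings such as 'AG'")
    return rai.count("A") == 2 and "C" in glt


def count_risk(genotype_pairs):
    """Number of (RAIi1, GLTSCR1e1) pairs that carry the risk genotype."""
    return sum(is_risk_genotype(rai, glt) for rai, glt in genotype_pairs)


if __name__ == "__main__":
    # Hypothetical individuals, for illustration only.
    people = [("AA", "CC"), ("AA", "CT"), ("AA", "TT"), ("AG", "CC"), ("GG", "CT")]
    for rai, glt in people:
        print(rai, glt, "->", "risk" if is_risk_genotype(rai, glt) else "no risk")
    print("risk carriers:", count_risk(people))
```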



Example 5 

DNA from humans from the cohort of Danish psoriatics with basal cell carcinoma
and controls, described in Materials and Methods, was typed with respect to a
number of sequence polymorphisms located in and around the claimed region r. The
resulting statistical p-values for association of occurrence of the individual sequence
polymorphisms with the status of patients are depicted in Figure 3. The results show
that the sequence polymorphism ERCC1e4 is strongly associated with disease in
this cohort (p = 0.01).

Example 6 



Blood samples were collected from a large number of Danish citizens and frozen.
After a number of years the women who got breast cancer in the intervening period
were identified, as well as a set of matching controls. DNAs were purified from the
blood samples of these persons and a number of polymorphisms, namely RAIi1,
ASE-1e3 and ERCC1e4, in the region of interest were typed. The polymorphisms
were subsequently combined such that the high-risk group was homozygous for the
high-risk alleles of all three polymorphisms: RAIi1^AA ASE-1e3^GG ERCC1e4^AA. All other
genotypes were combined into the low-risk group (Table 14; OR = 1.59; p(χ²) =
0.004).

Table 14. Occurrence of a combined "high-risk" genotype RAIi1^AA ASE-1e3^GG
ERCC1e4^AA as opposed to all other combinations of genotypes for the
sequence polymorphisms RAIi1, ASE-1e3 and ERCC1e4 in Danish cases of breast
cancer and controls.

             High-risk    Low-risk

Cases        120           65
Controls     277          312


The DNAs in these examples were purified from available frozen lymphocytes using
Puregene (Gentra Systems). A variety of other ways of purifying DNA is available to
the expert and would also be expected to lead to the wanted results.

Analysis of sequence polymorphisms can be performed with a variety of techniques,
some of which have been used in the examples of this application. Most often a
number of techniques can produce the wanted result.

Similarly, the choice of primers and probes in a particular assay is to some extent
free and other primers and probes might well produce similar results.

Finally, it is to be expected that assays for other sequence polymorphisms in the
region of interest may produce roughly similar results. Our particular choice of
sequence polymorphisms and assays used in the examples is thus not intended to
limit our claims. Thus, at present about 30 SNPs within the region r are listed in
NCBI's database dbSNP, including rs#2070830, rs#2017104, rs#2017154 and
rs#2377328, all within or very close to RAI. Other forms of polymorphisms, such as
the tandem repeat polymorphisms D19S543 and D19S393, are also known to occur
in the region and can probably serve as markers in the present invention. Moreover,
it is very likely that the region contains a number of as yet undiscovered
polymorphisms. For instance, the sequence of the 5' half of RAI and its upstream promoter
region is currently only a draft version, and new polymorphisms of potential use for
this invention are likely to be uncovered as more sequence reads of this segment
are produced.

Sequence of the r region of chromosome 19

The following depicts the region r stretching from the beginning of, but not including,
the XPD gene, to approximately the end of ERCC1, and includes the genes RAI,
LOC162978, and ASE-1. More specifically, r is bounded by and includes the
following two sequences: AGAACCCCCG CCCCTCCACC TCGTCTCAAA and
TCCCTCCCCA GAGACTGCAC CAGCGCAGCC, and is defined by SEQ ID NO: 1
herein below:

1 0 AGAACCCCCGCCCCTCCACCTCGTCTCAAAAAAAAAAAAAAATCGTCTCAGTAGCGAr 
ATAGTCTAACGGAGAATGACAGGGAAATTGGTGATCCTTTCTGGGCCCAAGAGTTA- 
GAAATGGCTTTGCAGGCCGGGCGCGGTGGCTCAAGCCTGTAATCCGAGCACTTTGG^ 
GAGGCTGAGGCAGGTGGATCACCTGAGGTCGGGAGTTCAAGACCAGCCTGACCAAp 
GATGGAGAAAACCTGTCTCTACTAAAGATACAAAATTAGCCGGGCGTGCTGGCAAATG- 

1 5 CTrGTAATCCCAGCTACTCGGGAGGCTGAAGCAGGAGAATTGCTTGAACCTGGGAGG- 
CAGAGGTTGCAGTGAGCAGAGATGGCGCCGTCGCACTCTAGCCTGGGCAACAAAAG- 
CGAAACTCCATTTCAAATATTAATAATAATAACTAATAAATAAAACATAAATGCTAGC7T- 
TrGTTTGTTTCTTCAACAAATAGCTATGTGGCATCTACCATGTGTCTGATCCTGTGCT- 
G GCCCCTG GGAACAGAAAQGTGACCATGACAGCCTCAGCACCTGCCCTCAAAGAACA- 

20 GATTTTrn*CCTTGAGACAGGGTCTTTCTCTGTCGCCAAGGCTGGAGTGCAGTGGCA- 
CAGTCACAGCTCACTGCAGCCTCCACCTCTTGGGCTCAAGCGATCCTCCCACCTCAG- 
CTTC CAG^GTA GCTGGGAGCACAGGTGTGCACCACCAAGCCCAGCTAAQTTTTATTTT' 
TTAAATTTTTTTAGAGACGAGGTCTCACCACGTTGCCCAGGCTGGTTAAACTCGCAG- 
GTTCAAGTGATCGTCTCCCCTCAGCCTTTCAAATTGTTGGGATTACAGGGGTGAGG- 

25 CACCAGGCCTGGCCTCAAAGAACAGATATTAAATATACAAATGAATATATGATTACAGC- 
CTGGAGTGGTGGCTCGTGCCTGTGOTTCCAACACTTTGGAAGGCGAAGGCGAGTA- 
CATTGCTTGAGCTCAGGAGCtAGAGACCAGCCTGGGCAACATGGTGAAAACCCGTC. 
TCTACAAAAAATGCAAAAATTAGCTGGGCGTGGTGGCGTGCACCTGTAGTCCCAGA- 
TACTCAGGAGGCTGAGGTGGGAGAATCACCTGGGCCTGGGAGGCAGAGGTTGCAAT- 

30 GGGCAGTGATTGTGCCACTGCACTCCAGCCTGGGCAACAGGAGTGAAAACCTATCT- 
CAAATGTG7GTGTGTGTGTGTGTGTGTGTGTGTGTGCGCACGTGTATAATCACAAGTA- 
CAAAAGTGCTGTGAAGGAAAACTTCAAGTCACCATAAAGATTGATTATGGGCTGGGTG- 
CAGTGGCTCATGCCTGTAATCCCAGCACTTTGGGAGGCCAAGGCAGATGGAT- 
CACGAGGTCAGGAGTTCAAGACCAGCCTGGTCAACATGGTGAAACCCTATCTCTAC- 

35 TAAAAAAAAAAAAAAAAAAAAAAAAGCCAGGCATAGTGGCATGCATCTGTAATCCCATC- 
TACTCGGGAGGCTAAAGCAGGAGAATTGCTTGAACCCAGGAGGCAGAAGTGAGCCAA- 
GATCACGCCACTGCACTCCAGCCTGCGTGACAGAGCAAGACTCCGTCCCAGAAAAA- 
GAAAAAAAAAAAAGACTTATTATGACAGGATGTCTACTGTCAAOTGTGGGGTGTGAGT- 
GTTGGCCAAGTGATCAGAGAAGGCTTCGTGGAAGAAGCGAGGTTTGAGTAGAGCCA- 

40 GAAAATAATTAGAAGAGATCAACCAGCAAGAGGGGATG6ATGAGAGAAGTGAGAAAG- 
GTGTTCCAGGGAGAGAGACCATCATACACAAAAGCTCTAGGCCAGAAGAAAGCT- 
GAGGCCTGTGAGTGCTGAAAGGAAGCCTGTGGGGGTGGAGCTCTGAGTTGAGCA- 
CAGG GAGCAGAGAAAGGGCAGCTGGAGGGGAAGGCAQGGGCAGATCGAAATCTCTT- 
TTTTAAATTAATTAATTCTTAATTTATT^ CC- 

45 CAGACTGGAGTACAGTGGCACAATCTCAGCGCACCGCAACCTCTGCCACCCAGGCT- 
• CAAGCAATTCTCTGGCCTCAGCCTCCCTAGTAGCTGGGATTACAGGTGGGCACCAO 
TACTGCCCAGCTAATTTTTATACTTTTAGTAGAAACGGGGTTTCACTATGTT6GCCAGG- 
CTGGCCTCAAACTCCTGACCTCAAAAGATCCACCCACTTCAGCCTCCCAAAGTGCTG- 
GGATTACAGGTGTGAGCCACCCTTCCCGGCTGTATTTTTGGAGACAGAGTCTTGCTCT- 

50 GTCCCAGCCTGGAGTATGGTGGTGTGAATTTGGCTCATTGCCACCTTGACCTCCAG- 

GGCTCAAGTGATCCT CCCACCTCA G CCTCC TGAGTAGCTGGGACTGCGGGTACACGA- 
CACCACGCCTGGTTAAI 1 1 1 I 1 1 1 AATTTTTTGTAGAGACGAGGGTATCTCACTATGTT- 
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gtccagqcrggttgaactcctgagctcaagcaattctcccacctcagcctcc- 
caaagtggtgggattacagacgtgagccactgtgcccggcttaatttatttacataa-- 
attttti j atgtttac 1 1 1 1 ctatctcctacaggaagaaaatatattttgttattgacag- 
ggtctcgctatgttgcccaggctggtattgggctcaagccatcctgttccctcagcc-

tcccaaagtactgggattacaagcgtgagcctctgcatccagcccagatccaaaatct-
ttactgtcacctacagagtcctctgtaactagcttactgctcatcatgcccataccaac' 
ccaccttactgctctgatctcctcctctctctcccccagctcatt7tgtttcagctatg- 
ctggtctccttgctgtctctaaaacatAacaagcacatcccatctcagggcctttg- 
caccagctattttgtctgcctggaatgctgtttcccctgatagccatgtggctgaca- 

cactcacctccctcagctctttgctcaattgtcaacttctcggcccggcatggtggct-
cacacctgtaatcctaccactttgggaggctgaggtgggcagatcacctgagatcag- 
gagttc6agaccagcctggccaagatggtgaaatcccgtctctactaaaaatacaaaa- 
attggcaaagcatggtagcacataccagtaatcctagctacccgggaggctgagg- 
caggagaa7tgctggaacccgggaggcagaggctgcagtgagccaagatcatgc- 

1 5 cactgtactccagcctgggtgacaaagc aagactctgtctcaaaaaaaaaaaagtctc- 
cttctcaatgagggcttcctgaccaccaaattaaatctacctcctagacacacacaca- 
cacgcacgcacgcacggacacacacacacgcacgcacgcacacacacacacagacar 
cacactatatcccctttccctgctttattgttcttgagagctcatttaaccatgtga- 

catgctgaatattttacttatttattttgtttagaaagctcctggctgggcgcgggggc-
tcacgcctgtaatcccagcac7ttgggaggctggaacaggtggatcatgtgaggt-
caggagttccagaccagcctgaccaacacggtgaaacctcatctctattaaaaatg- 
caaaaattagctgggtgtggtgtcgcatgcctgtaatcccaactactcagaaggct- 
gaagcaggagaatcgcttgaacctgggaggcagaggttaacgctgagccgagatcg- 
cgccattgcactccagcctgggcaacaagagtgaaactctgtctcgaaaaaaa- 
caaaagtcagctccatggcaggagtgatggctcacgcctataatcccagcactttgt-
gaggccgaggcgggcggatcacttgaggtcaggagttggagaccagcctggccaa- 
catggtgaaacctcatctctactaaaaatacaaaaattagccgggcgtggtgacacat- 
gtctgtagtcccagctacttgggaggctgaggctggagaatggcttgaacctgg- 
gaggtagaggttgcagtaagccaagatcgcgccattgctctccatcctgggcaaca- 
gactccgtctcagaaaggaagaaagaaggaaagagagaaagagagaaagaqacaga-

GAGaGAGAGAGAAAGGGAGAAAGAGAGAAAGGATGGAAGGACCCTGACAAGCACT- 

gttgc ataaaag i i ic i m r ctctct cl 1 1 i h 1 1 h i m i i ml i m i gagacagggtc- 
tcacttctgttgctccagctgaagtgcagtggtgagaacatggctcagtgcagcct- 
caacttcccaggcttaagtgatcctgccacctcagcctcctgagtagctgggactg- 

taggtgtgcaccaccgtgcctagctaattttttgtatttttagtagagacatggttccg-
ccacgttgcccaggctggtcttgaactcctgggcttaagggatctgcccgccatggc- 
ctcccaaagtgctgggattaccagogtgagccactgtacccagcctgagtataggtt- 
tctgataaattttagqatcatattgtttggactgggtaagaatttccagaactctaat* 
gaagaaactgactggtttatattttattttattt^ 

tcactcttgttgcccaagctggattgcagtggcacgatcttggctcaccacaacctc-
cgcctcccggtttcaagtgattctcctgcctcagcctccccaggagctgggatta- 
caggcacccaccaccatgctcggct a i itttttttttatttttttattt1 la qtaqar

gacggggtttcaccatgttggccaggc3tggtctcgaactcctgacctcaggtgatc.

cacctgccttggcctcccaaagcgctgggattacaggcatgagccactgtgcaaggc- 

ctaggctggtttataaaattgctaaaccaagcagaacatgaattaaataccaaggaa-
atactctcctagattgtcatgttacatcagccaatactaaaa7tgtgaagatacacaat- 
ttgaatgaactccatggtccaagtcgaattatctatgatattacccatctaataaacag- 
cactatgtcccttaatgggagaaaaagttggagaatttaagagaatatcaatccaat- 
gttggttgggtgcagtgaatcatgtctatattcccagcactttgggaggccaagg- 

caggaggatcacttgagcccaggaattcaaggccagcctcggcaacacggtgagatc-
ctgtctctacggaaaattaaaaaaaaaaaaagagagagattagtgggatgtggtgcc- 
tatagtcccagctacttgggaggctgaggcgggaggatcatttaagcctgggacgtt- 
gaggttgcagtgaaccatgagtgagactcatctcaaaaaaaaaaaaaaaatggcgat- 
cactagaggaaaaaaaaactaaagtggggtttgcgggtagtgggagggcccttcctg- 

ctaggttgcactatgatctccagggaggctccacgggagaatcatttccrrgt c l h i -
tcagtttctagagccaaatfctttgcataccttgcattccttggctcggaaccccttcc- 
ctaaccttcaaagctggcagctagcctctggctcaagtgtcacatggcctgtctct- 
gtcttcctatccaatcttcctc7tataagaacattggagccaggcatggtggctgacg- 




CCTGTAATCCCAGCACTTTGGGAGACCOAGGCAGGCGGATCACAAGGTCAGGAGT- 
TCGAGACCAGCCTGGCCAACACAGTGAAACCCCGTCTCTACTAAAAAAATA- 
CAAAAAAGTAGCCGGGCATGGTGGCAGGTGCCTGTAATCCCAGCTACTTGAGAGGCT- 
GAGGCAGGAGAATCGCTTGAACCTGGGAGGCAGAGCTTGCAGTGAGCCGAGATAGT- 
5 GCCAATGCAGTCCGGCCTGGGCGAAACAGCGAGACTCCGTCGCAAAAAAAAAAAA- 
ATAATAATAAATAATAAATAAAAATAAAAATAAAATAAAAAAATAAAAATAATAAAATAA- 
ATAAAAATTATTTTGAGACAAAGTCTATTCTGTGGCAGAGGCTGGAATGCAGTGGCGT'- 
GATCACAGCTTACTGCAGCTTCTACCTCCTGAGCTCAAGCGATCXTrTCCACCTTGGCT- 
TCCTGAGTAGCTGGGACCTCAGGTGTACAWACCACGCTCAGCTAATTAtTTATTTATT. 

TATTAT AI i I I I 'OTGACGGAGTTTCGCTCTTGTTGCCCGGGCTGGAGTGCAATGGTGC-
TATCTCAGCTCACTGCAACCTCTGCCTCCTGGATTCCAGTGATTCTCCTGTCTCAGCT- 
TCCTGAGTAQCTGGGATTACAGGTACATGCCATCACGCCCAGCTA A I 1 1 1 IGTATTTT- 
TAGTAGAGACGGGGTTTCATCATATTGGTCAGGCTGGTCTCGAACTCCTGACCTCAG- 
GTGATCCACCTGCCTTGGCCTCCCAAAGTGCTGGGATTACAGGCGTGAGGCACCACG- 

CCCGGCAA TfT I H H I T C I I I M J I I 111 I CAGACAQAGTCTTGCTCTGTCACCCAGGC-
TGGAGTGCAGTAGCGTGATCTCGGTTTACTGCAACCTCCATCTCCCGGGTTCAAG- 
CGATTCTCCTTTCTCAGCCTCCCAAGTAGCTGGGACTACAGGTGCACACCACCACGG- 
CGGGCTA AI 1 1 1 IG T AI \ \ 1 tA GTAGACACCAGGTTTCACCATATTQQTCAGACTGGTO 
TCAAACTCCTGACCTCAGGTGATCCATCTGCCTCAGCCTCCCAAATTGCTGGGATTA- 

CAAGCQTGAGCCACACACCTGGCTTA A I nTTTTATTn \ GATCG ACACAGGOTCTCCC-
TATGTTGTCCAAGCTGGCAGAG A I rTTTGTl 1 G TTTGTTTGAGAGGGAATTTTGCTCTT- 
GTAGCCCAGGCTGGAGTACAATGGTGCAATCTTGGCTCACCACAACTTCCGCCTCC- 
CGGGTTTAACAGATTCTCCTGCCTCAGCCTCCCAAGTAGCTGGAACTACAGGCACC- 
TACCACCACACCAGGCTAATTTTTGTGCTTITTAGTAGAGATGAGGTTTCACCATGTT- 

GGCCAGGCTGGTCTTAAACTCCTGGCCTCCAGTGATCCACCCGCCTTGACCTCC-

CAAAGTGCTGAAATTACAGGCGTGAGCACCGCGCCTGGCCTCTCAACCTACAATTT- 
CAACACCCAAGGAAACAGCCCACCATGAGTGAGAACCAGCAGACACAACAAACTA- 
TAGGATTAGCTGCCTCCAAACTTCAGGTGATAGATTATCAGGCATGTACTTGAAAC- 
TAAAGGACACAAAAGAAGAATCCGAAATATAAAATAAAGGATTGGACTTGTGTGAAAA- 

GAATCCCTTAGAAAGG6CTACTTTCAGGCTGGCCATGGTGGCTAATGGCCTGTAATC-
CCAGCACTTTGGAAGGCCGAGGTGTGTGGATCACCTGAGGTCAAGAGTTCAAGAC- 
GAGCCTGGCCAACATGGTGAAACCCCGTCTCTACTGAAAATAGAAAAATTAGCCAGGT- 
GGGGTGGCAGATGCCTGTAATCCCAGCTACTCGGGAGGCTGAGGCAGGAGAATCGC- 
TTGAACTCAGGAGGCAGAGGTTGCAGTGAGCTGAGATTGCGCTATCGTGCCCCAGCC- 

35 TGGGCACTAGAGTGAGATCAAAAAAAAAAAAAAAAAAAGAAGAAGAAGAAGAAAGGGC- 
TACTTTCAGACTGCCTTGCCAAAAATCATAACCACAATGATGAGCATGTA7TGAGT- 
CAAAACAGAATC AAAAG AGAAGAAAGTCAATTTCTGTGCAAACTACTTTTATTTATAAG« 
GAAAGTTTCTCTATTTTGTTTATAAACATTAAACCAGTGCTGTGTGAAGGCACTTAAT7- 
GGGGAGAGGTGGGGCAGGGATCCTGGTAGAGACCAATGTTTCCCACCCAGACCC- 

CAAGACTGCTGGGAGAGATGGTGTCAGCAGTGACTCCCAGGAATATCCAGTGGTGTG-
GTGGCCCATCOCAGGCCCGGCTGGGCAGGTGGCTGGCTTGCTGGGGGATGTGAT- 
GATGGTGGTAGGCATGGGAGGCACTTTGGACGGGATCTGATTTGGCAAAAGGAAGTG- 
GTTTCCTGTCCCC3AGTGATTTCCAGCCCTTCCCAGACCTCCCAAGGCTAAGGCAGAT- 
TACTAAATTTAAGGCTGGGGCCCTCCTTCTTCCCTGGACTTCCAGGAGAACAGAGAAC- 

CGGTGGCAAGGACCACCACCAGCAGGGTGAGGGGTGCAGATAAAGGCAGCAAAAAA-
CAGAGGGAGAGGTCTGGAGGGAAGGCAGGAATGCTTGTTTCTGTCAGCCTCAGAAAC- 
CTCCTTCTATCCTGCTAGACTTTACTCCTTTGAGGCTTCACCCTGGGGAACAGCTGGG- 
GAGAGACAGGATCTTCAGACATCAGGAGCTCCCACCTCCTCATCCCACATGCAAATC- 
CGCTGCCTGTCTCTATCCTCCCACCCCTTCCTAAGGGGACCTCTCAGCACCTCC- 

CAAACTGCTCCAGAATCCAAGTTCTGTGTCACCTCCAAGAACCAGATGGAACCTTCCA-
ATCAGAGCCTCCACTGATGAAATGGAATATTTCCAGTGTCTCCTAACTGCCATAAGGA- 
GAAGCCCACCTCTCTCTAACACCTTGGTTGTC t \ 1 I 1 G GGTCCC ACCTCCATATT- • 
TAAAAAATCTCCTCTCTCAGGGCCGGGAGCAGYGGGTCACACCTATAATCCCAGCAGT- 
TTGGGAGGCCGAGGTGGGTGGATGACCTGAGCTCAGGAGTTCAAGACAAGCCTGGT- 

CAACATGACGAGACCCTGTCTCTACTAAAAACACAAAAAATTAGCTGGGCGTGGTGGT-
GCATGCCCGTAATCCCAGCTACTTGGGAGGCTGAGGCAGGAGAATCACTTGAATCX^G- 
GGAGGTGGAGGCTGCAGTGAGCCAAGATCGCGCCACTGCACTCCAGCCTGGG- 
CGACGCAGCTGAAGCTGTGTCTCCAAAAACAAAACACACACACACACACACACA-
GAAAAAAAAAACCAAAATAAAAAAATCTCCCTTCTCAGGAATGTAACGGAATCTTCCTT-
GCCTTCTC^CCTMCCCTAATAGAGAATTTTCCTCAGTTACACTGTAATTTTATTAATG- 
G AI n r TCCTCATrCTGCCCAATGCAGTGTAATGAAAGCTTCCTCTCCATCTGTTATAT- 
TATATATAAATATATATTATATAm 

TATTGTCACCCAGGCTGGAGTGCAGTGGCACCATCAGGGCTCACTGCAGGATCAATC- 

TCCCAGGCTTAAGCGATTCTCCTGT GTCAGCCTCC TG ATGAG CTGGGATTACAGG> 

CACCCGCCACC AC ACCCGGCT AACTTTTTTTTTTTGTATTTTTAGTAG AGATGGAGTTT- 

CACCATGTTGGCCAGGCTGGTCTAGAACTCCTGACCTCAGGAGATCCGCCCGCCTT* 

GGCCTCCCAAAGTGCTGGGATTACAGGTGTGAGCCACCTGGCCGGGCCCTCCACT. 

TCCTITCTTGTACATTGCTGAATCCCTGTGTCAGCCCTAGAGGTCCAGTCTTTTGCCCTC- 

TCCCAGCCTTAATCTACAATTCTGTAACCCACCCACCATC^TTAAAATGAGATTCTTCT- 

TTGTCGCTTCCCTTGGCTAAAATGGATTATTCTTTAACCTCTCCACCAATACAACCAGG- 

GATGATAATAAAAACATTGGATTGAGCAGAAACCAATCAAATAACTAGTAAGGCAGTAG- 

TGGCGAGCACCCTACATCCTGACAGCTTTATAAAGGGCGCTTCCAGCCAGGTGCGGT-

GGCACATGCCTGTAATCCCAGGACTTTGGGAGGCTGAGGCGGGCAGGTCACCTGAG- 

GTCAGGAGTTCAAGACCAGCCTGGCCAACGTGATGAAACCCTGTCTACACAAAATAr 

CAAAAAAAAAAAAAAAATTAGCCGTGCGTGGTGGCATGCGCCTGTCATCCCAGCTAC- 

TCTGGAGGCCAAGGAGGGAGGATCACTTGAGCCCGGGAGGCAGAGGTTGCAGTGAG- 

CCCACATCTTATCACTGCACTCCAGTCTGGGTGACAAAGCAAGACTCCATCTCAA- 

ATAAATAAATAGAAATTGGCCGGGTGCGGTGGCTCATGCCTGTAATCCCAGCACTTTG- 

GGAGACCAAGGCAGGTGGATCATTTGAGGTCAGTAGATCAAAACCAGCCTGGCCAA- 

CATGGTGAAACCCCGTCTCTACTAAAAATACAAAAAGTAGCCGGGCGTGGTG6TGGT- 

GGGCGCCTGTAATCCCAGGCAGGAGAACTGGTTGAGCCCGGGTGGGGGGGGCC- 

CGAGGTTGCAGTGAGCACAGATGGCGCCATTGCACTCCAGCCTGGGCGACAGAG- 

CGAGACTCCGTTTGAGAAATAAATAAATAAAATAAAAATAAAAATAAAAAAATAATAGAA- 

ATTTAAAAATAAAATAAAGGGCTTTTCCTCACCTACTCCACTAACTATAAGGGACCCT- 

TACCCCCGACATTACTATTAAATATAACGGACTTTTCGTCTCCTCCCCATGAGCAATA- 

ATGAGCTTTTCAGACCTCCCTCTCCCAATATAACGGTTTGTTCCTGTTGCCTCTTCTTT- 

TTCCTGTGGGATCCCCCTTrrCCCCAACCCCCAACTGTCGGGAGGTCCCCATGACTTC- 

TCCCCTGGGCTCACCCCGAAGTAGTTCCGCGGCACGTAGCCCTCCTGGCCGTGCAG- 

CGCGGCCCACCACCAGTCGGTCTCCTCCGGCCCGTCCCTCCGCAGCACGGTGAO 

CGACTCGCCCTCGCGGAAGGACAGCTCGTCCCCGAACTCGGCGCTGTAGTCCCAGA- 

GAGCGTACACTGCCCCGCTGTTCATCAGCCCCATACTCTGCTCGACGTCTGAAACAT- 

GCCACGGAGGGGAAGGTGAGAGCCTGGCCCAGGGGGTCCAGGAACAGGGGC- 

CACGTGGGGTCCAGGACAGACCCTGGAATTTGGCGCCTGTCCCAGCAACCACCTGAA- 

ATGTTGTGTGTGCCCATGGCTGTGGATGGGAACCGGAGCTGGAGTCAGATGCCGG- 

GACTGGCCGTCTTTGAGCGTTCGAGGAAACTGGGGGAGGCATGCCAGTGGGCCACC- 

CACTCCCGAGGCAGGGTCAGAGGCTCCCATTTCTTTTCTTT CI rTTTTTTTTTTTTTTG Ar 

GACAGAGTCTCGCTCTGTCGCCCAGGCTGGAGTGCAGTGGCACGATCTCG6CTCAC- 

TGCAACCTCCGCCTCCCGGG7TCACACCATTCTCCTGCCTCAGCCTCCCGAGTAGCT- 

QGGACTACAGGCGCCCGCCACCACGCCTGGCTA A TT 1 1 IG GT AI I I I i AGTAOAGT- 

CAGGGTTTCACCGTGTTAGCCAGGATGGTCTCGATCTCCTGACCTTGTGATCCGCC- 

CACATTGGCCTCCCAAAGTGCTGGGATTACAGGCGTGAGCC 

I f 1 1 1 1 1 J I I I I 1 1 1 I 1 1 I I GAGATGGAATTTCGCTCTTGTCGCCCAGGCAGGAGTGCA- 
ATGGTGCGGTCTCACTGCAACCTCCGCCTCCGGAGTTCGAGCCATTCTCCTGCCT^ 
CAGCCTTCGAAGTAGCTGGGATTACAGGTGTGCGCCACCATGCCTGGCCAAI J I 1 1 G- 
TATCnTrTAGTAGAGACGGGGnrrrCACCATGTTGGTCAGGCTGGTATCAAACTCCTGAC- 
CTCAAGTGATCCACCCGCCTCGGCCTCCCAAAGTGCTGGGATTACAGGCGTGAGC- . 
CACCTGGCCCGGCCCTCATTTCCTTCTTGTACATrGCTGAATGCCCGTGTCAACCCTA- 
GAGGTCCAGTCTTTTGCCCTACCCTGGCGCTTAGCTTAAGTGGTACAGTCTCTAAG- 
GAAGATTCGCACCTTCCTTGAATGATAGGGTCCTTTAAGTTGGCTCATCTGCCTCTTTC- 

I I net 1 1 i crrrr cTi rrcrm i g gagacggagtcttgctctgtcgcccaqqctq- 
gagtgcagtggcgcgatttcggctcactgcaacctccgcctcctgggttccagcaat- 

TCTCCTGCCTCAGCCTCCAAAGTAGCTGGGACTACAGGCCCACGCCGCTACACCCGG- 
CTAAATTGTTTTATA I \ I I I A ATAGAGACGGGGTrTCACCGTGTTGCCCAGGCTGGTTT- 
GGAAATCCTGAGCTCATGCAATCCGCCCGCCTCGAGCCTCCCAAAGTG CTAGG ATTA- 
CAGGCATGAGCCACCGCGCCTGGCTTTCr I T T1 CI I ITCTTTTCTTTTTTTm I CAG A- 
C^GGTCTCACTCTGCCACCCAGGCTGCGGGAGTGCAGTGGTGAGATCAAGCTTACT-
GCAGCCTCGAACTTCCAGATTCAAGCAATCCTCCXrGCCTCAGCCTCCTCCTGATTCTT-
TATGTTATTATTAAATATTTTGTAGGCCGGGCACAGTGGCTCACACCTATAATCACAG* 
CACTTTGGGAGGCCAAGGCAGGCGGATCCTCTGAGGTCAGGGGTTTGAGACCAGCC- 
TGGCCAACATGGCAAAACCCCGTCTCTACTAAAAATACAAAAAAAAAAAAAAAAAAAGT- 
TAGCGGGCCGTGGGGCCGTTGCCTGTAATCCCAGTTACTCGGGAGCCTGAGGCAG- 
GAGAATCGCTTTCACCGAGGAGGCAGAGGTTGTAGTGGGCTATGGTGCCATTGCAC- 
TCCAGCCTGGGTGACAGAGCAAGACTCTGTCTCAAAAAATAAATAAATAAAAATAAr 
ATAAATATTTCGTAGAGGTCAGGTGTGGTGGCTCACACCTGAATCTTAGCACTTTGG- 
GAGGCCAAGGTGGGCAGATTGCCTGAGCTCAAGAGTTCGGGACCAGCCTGGGCAA- 
CACTGCAAAACCCCTTCTGTACTAAAAATACAAAAAAATGAGTCGGGCATGGTGGT- 
GAGCACCTGTAGTCCCAGCTACTCAAGAGGCTGAGGCAGAGAATTGCTTGAATCCAG- 
GAGGTGGAGGTTGCAGTGAGCCGAGATTGAGCCACTGCACTCCAGCCTGGGTGA- 
CAGTGAGACTCTGTCTCAAAAATAATAATAAATAAATATTTGTAGAGACAGGGGGTCTO 
TACAATGTCTTGTAGCCTGACCAGGCTCACCTTTCAAATATATAACCCTCTGTCTCACC- 
CATAAGTCCTAGGACCTGCCTCACTCCAACTCTCCGTGAAGTTCGTTGCCCACACCGA- 
GATACAACTGGCTCCTCCAGGTGTGAAATGACCCTGTGCACAATCCCCGTGGCACAG- 
CCTACTTCGCCCTGCCCGTCGGGGAACCAGGTGATGTAGCCTGCCCCCTGGAGAGA- 
TAGGGTACAGCCTTGTGTCTTCCTACAAGCCCCTTTCTGGCAGCTGTAGCCTGCTCAC- 
CTGCCAGTGGTGTGGCAATGCCTCTCCCACAAGTGGCAGAGCCCACCTGCCCAGAG- 
CCCTATGCCAGGTAGATGGCAGGGTTGAAACGTTCAGCTCCTCACCCTTGAAGATGT- 
GAAAGGTGAGCAGACCAATCTTCACAGCCACTCTCCTCCCCAAAGGTGTCCAGCTCG-
CATAGCACAGCCTCCATGTCCCCTTTTTCCTTAGGAGGGCATAGTCCCCCCACCCC- 
CGCAAGCGGTCCATCCCTCATCCTCCTCCTCGGCAATCCTGCCAAGTGGTTGGTA- 
CAGCCCCCATACCCTTCTCTCCCTAGTAGGGGGTAGTTGCTCCCCTCCCCGCTCCTG- 
CGCACCCGCCAGGTACCCAGGCGCCAGCAGCCCTGCCTCGCACCTGCCAGGTAGGT. 
GGCGCAGTCAGCATAACCCTCGCGGTAAGGGTCGCACTTCTCGAAGGCGGTGGCGC- 
CGTCGCTGAGCGTGGTGGCGAAGA7TGCAGCGCCGTGCTGCACCAGCGCCATGCA- 
GATGACTGTGTCGTrGCACGACGCCGCGCAGTGGAAGGGTGTCCTAGGCGTGGGG- 
GTGGGGGGTTGCGGGGAACGATGCGTGAGAGGCTGCGCGTCCGCCCACGGGGGAO 
CCAGCCCACCGCGCGGGTCGGGGCTCACCAGCCGTGGCTGTCGGGGGAGTTGA- 
CATTGGCACCCGCGGTGATGAGGAAATCCACGATAGAGTAGTTGGCGCCGCAGAT- 
GGCGTTGTGCAAGGCAGTGATGCCCTCCTCGTTGGGCTGGCTCGGGTCGTTCATCT- 
GAGTGCACCG6GGGAGGGGGAAGACTGAGTCCCGCGGCTGGCATCTGCGATGCCC- 
CCGCCGTGCCCACCTCCCGCTCAGCAGCGCTCACCTCCTTCACCGCCTGCTGCAC- 
CACCTCCAGCTCCCCGGTCAGCGCCGCGTCCAGGAGGAGCACCAGAGGGTTGAGG- 
CGCGCGCGGCGGGCCTTGCGCGGGGAGCCCGCCTTCCGCAGCACAGAGCGCATC- 
7CCTGGGGGACAGGGCGCAGAGGTCAGGGAGTTGGAGGGATTG7TAGTATATCCAT- 
GATCTAGAGTAGGAAACAGAGGTCCAGGGACTTGTGGCACCCATCTAGACAGGGGTA- 
GAACTGGGATTCCCTCGGGATGGGGTGAGGGGGTGCCTTCGATCTCCTCCTAGAGCC- 
TCCAGTTCCCTGCCATAGACAGGGAATCCTGTGATTrGAGAATCTTGGGCCCTGAAAC- 
TTGGGAGAAAGCTGGGGGGCCATGGGATTGGTGGCAAAGTAATTCTATCAGTT- 
CAAAACAATGATTGTGGAAGCCAGTTATGCAATTCACACACAGTCTGACATTTCTTTT- 
GTTAATAATGAATGCAATGAGACACACATGACAAAATGTTACCAGGAGTGTTCATTC- 
CGGATGTTTGGAATTTGAGGATTTTATTATTCCTTGTATTTTC O I ITTCTTn I CTCTTT- 
I 1 1 I J I I 1 1 1 I I UGAGATGGAGtCTCGCTCTGTCACCCAGGCTGGAGTGCAGTG- 
CAGTGGTGTGATCTCAGCTCACTGCACCCTCCATCCCCCAGGTTCAAGCAATTCTCCT- 
GCCTCAGCCTCCTGAGTAGCTAGGATTACAGGCATGCGCCACTATGCCTGGCTAATTr- 
TCATATTrTTAGTAGAGACAGGGTTTTGTCATGTTGTCCAGGCTGGTCTCGAACTCCT- 
GACCTCAGGTGATCCACCCACCTCAGCCTCCCAAAGTGCTAGGATTACAGGTGTGAG- 
CCACTGTGCCCAGCCTCATGGQCTTTCTT A 1 1 \ \ I AATTTTCCTCCTGTAAGATTCATT- 
TATTCTGGGCTGGGCGAGGTGGCTCATGTCTGTAATCCTAGCACTTTGGGAGGCT- 
GAGGTGGGAGGATCACTTGAGCCCAGGAGTTCGAGAACAGCTTGGGCAATATAGTGA^ 
GACCCAGTCTCTACAAAAAATAAAAAATTAGCCTGACATGGTGGCGCACACC- 
CGTCGTCCCAGCTACTTGGGAGGCTGAGGCAGGAGGATTACTTGAATGGAAGAGAAG- 
GAGGCTTCAGTGAGCCATGATCATGCCACTGCACTCTAGCCTGGGCAACAGAGTGA- 
GAdCCAGTCTCAAAAGAAAAAAAAATGCATTTATTTATTCCAAGTGTGTGAGTGCATAG- 
CATTTGTGATTCTGGTCTTTGCTGTTTCCAGAGTTTCAGTGATTrTAAGATT 
CAGAGATCCCAACAGCCACTGAATTCAAAATTCCCAGATGCTCAGTTATTTCAAGTTTC-
caatatgttgtgattgcagaaatgctaggctgtgctatttca
caggactttggaatccaaagattctatgatggagaactttaatatttrtct 
tci tihii r gttggtttttttqaqacagagtctcgctctgtcgcccaggctggagtg- 
cagtggtgcgatctcagctcactgcaagctccgcctcccgggttcaggccattctcc- 

tgcx:tcagcctgccaagtagctgggactacgggcgcccgccaccacgcctggctatt-
ttgt a i i i i ia gtaaagatggggtttcaccgtgttagccaggaaggtcttgttctcct- 
gacctcgtgatccgcccacctcggcctcccaaa^tgctgggattacaggtgtgagc- 
catcatgcctgacctagaatttcattttaaaagactagaaggaaatggctgggtgcg- 
gtggctcatgtgtgtaatctcagcactttgggaggctgaggagagtggatcacct- 

gaggtcaggcaggagttcaagaccagcctggccaacgtggtgaaaccctgtctctac-
taaaaatacaaaaattaggtggccgtggtggtgcacgcctgtaatcccagctactcag- 
gaggccgtggcatgagaatcacttgaacccaggaggcacagttatagtgagctga- 
gatggcaccatcgcactccagcctgggtgacagagtgagactccatctcaaaaaag- 
gaaaaaaaaaagaaagactagaaggaaatattcaaaatgttaatgatggttccctgt- 

gagtggtgtgattttgtcctc i i ici tct a i » 1 1 1 atttattttcgccaagctctctatg-
gtgttggtgtatttctctatagtggaatgtgtaaatttaaagtataaatctcagctggg- 
cacagtggctgatgcctggtttgagaccagcctggacaacataatgagaactgtctc- 
tactgaaaatgttaaatattatctgggagtggtggtgcatgcctgtagtcccagc- 
cataggggaggctgaggcatgaggatcaattgagcccagtaggtggaggctgcagt- 

gagccatgatcttgccactgcactccagcctgggcaacagagtgagactctgtc-
tcgataataataaccctctattacaacatatcagtgcatgaatttgt.gattttataatt- 
caaaatatgagcatctttaattgtcagatttggtgacttcaagaatcagtaataat- 
cagtctatgatactaactttataatt ai 1 n 1 i 1 1 a agagaag agtttcxttttattt- 
tattttatttgagacagagtttctctctgttgcccaggctggagtgcagtggcgca- 

atctcggctcactgcagcctctgtctcctaggttcaagcaattctcctgcctgagcc-
tcccgagtagctgggattacaggcatgcaccaccaggcccagctaai 1 1 i igtatttt- 
tagcagagacggggtttcaccatgttggcgaggctagtcttgaactcctgacct- 
caagtgatccacccgcctcggcctcccaaggtgctgggattacaggcatgagccac- 
cgtgcccagcctaactttataattctaagatcgtgttcaaacctttaaatgctctaggg- 

ctctaaaatgttactatcctaagacggtgacactagcgtttgattcttacattctatgat-
tttttaagtttctctgtggccaggactctgtgattctacaatgggatgctcagccattt- 
caacatgttgttattcatcccctcttgatttcaaaatcctgagcctcaaggttccttgc- 
ctttactttcaggagggcctaggaataggcattttgggggggtccacctgacccctg- 
cttctctgagaagtgatctcttcccgctgtctacgcacacggagtgttcaggactgt- 

tccatgtggctacaaccctcttcccagtcaagatgcagggaccaagatcagcagga-
gaccatcccctggtccaatggtgacaacagtaagagcagttaacagttatgtgccagg- 
tattatgctaagcactacattaatgtatttaatcttggcggggtgtggtggctcacacc- 
tgtaatcccagcactttgggaggccagggcgggcagatcacttgaggtcaggagtt- 
caagaccagcctagccaacacagjgaaaccccatctctactaaaaatacaaaaattag- 

ccaagcgtggtggcatatgcctgtaatcccagccacttgggagactgacgcaggaga-
atcacttraacccaggaggtggagtccagcacccagccgagactcacttgtltttatt- 
tatttattt attt a 1 1 1 1 1 am fttal 1 11 1 ii i gagacggaatcttgctct6tcacc- 
caggctggagtgcagtggcgcgatctcagctcaccacaagctccqcctcccgggct- 
cacgccattctcctctcagcctccagagtagctgggactacaggcgcccgccaccac- 

ccccagcta ai 1 1 i i gt ai i 1 1 i a gtagaqacgggqtttcaccgtgttagccaggatg*
gtcttatctcctgacttcgtgatccgcccgcctcggcctcccaaaatgctggga7ta- 
caggcatgaaccaccacecccggcctatttatttatttatttagagatggagtcttgc- 
tctgtcgcccaggctggagtgcagtggtgcagtcttggctcactgcaacctccgcct- 
tccgggtttaagcgattctcttgcctcagcctcctgagtagctgggattggaatga- 

gaccaccacttctcctgttgtcotcc^^

ttataagacaggaaaaaaagggagaaagcaaaacgctggaaaaaaacagaagtacga- 
taaatagctagatgaccttggcgccaccatctggtcctggtggttaaaataataata- 
ataatattaatccctgaccaaaactactggtgttatctgtaaattccagagattgtat- 
gagaaagcactgtaaaacgttttgttctgttagctgatgtctgtagcccccagt* 

cacgttcctcacgcttacttgatctatcgtggccctttcacgtggaccccttagcgtt-
gtaagcccttaaaagtgctagga a 1 1 i ci i m i c ggggagctcggctcttaagacgct- 
gatgctcccggccgaataaaaacctcttccttctttaatccggtgtctgaggagtttt- 
gtctgtggctcgtcctgctacagaattacaggcacgcgccaccgctccgggctaatt-
TTTGT AI HIM 1 A GTAGACAGGGGGTTTCACCATGTTGGTCAQGCTGGACTT6AACC-

TCTGACCTCATGATCCACCCACCTCGGCC TCCCA AAGTGCTGGGATTACAGGCGT- 

GAGCCACCGCGCCCGGCCGAGACTCACTATTTTATAAGAGGAGAGAGCAAAGCCAG- 

GAACAGTGGCTCATGCCTCTAACTGCAGCAATTTGGGAGGCTGAGGCAGGTGGAT- 

CATTTGAAGTCAGGAGTTTGAGACCAGCCTGGGCAGCATGGTGAAACCTCATCTCTAC- 

TAAAAATACAAAAATTAGCCAGGAGTGGTGGCATACACTTATAATCCCAGCTACTTGG- 

GAAGCTAAAGCGGGAGGATGGCTTGAACCTGGGAGGCGGAGGTTGCAGTGAGC- 

CGAGGTCAAGCCACTGCACTCCAGCCTGAGTGATGGAGCAAGACTCTGCCTG- 

GAAAAAAAAAAAAAATAGAGGAGAGAGCAGAGCAGACACAAGAGACACAGAGAGAGA- 

GAGGGAGAGAAGAGAGGGTGACTGCTTTGATTCAGGCAAGACTTCTCAGTCCCAGA- 

ATGAACCCACTGTTGTGCCAAGACTCAGTCATGTCCAGGTGTATGACTCGAGATTGCT- 

GAAGGAATGCCCGGGGCAGGGCACAGGCACAGGTTATTGGAGAGAAGGAGCAGA- 

GAACATCTCTATGTGGCCAAGACTCCCAGATGGCCCTCCATATAGTCACACACAGC- 

TATCCTAAAGACTACATTTCCCAGCATCCCATTGCAATGAGGCTCCTGGCCAGTGG- 

GAGCAGGCAGAGTGATGTATGGAACTGCCAGGTTCTGCCTGAAACAGGAAAGGGGAC- 

TrrCTCTTCTTCTTTCTCTCTTCCTGGCTGGAGGGCAGACTTGGTGACAGCCATGTAG- 

GACCATGAAGGCAGGCTTACTCCCCGATGGATGGCAGAGCCCCAGGTAGATAGAGCC- 

TGGGTCCTGACTCCAGTGAGGTGCCTACAGTGCTGGGCTGCAAACTCTTGGACTTC- 

TACTCAAAAGAGGAGAAAACTTCGATCTCATCTAAGCCACTATATTTGGGGGGCTCT- 

TTGCTACAGCTCCTGGATTCATGTAGCAAACATACCCCGGTTTCCTCCTGTATTACT- 

TACCATGCTCTGCGGCTGCTCTGGTGGGCTGCTCTGGGACGGGGCCGGGGGTGGA- 

ATGGGAGCTGGTGGGGCAGGAGCAGGGGGCCCTGCCCTGGCCTCAGATCCCTCAGT. 

GATGGGGGACAGCTCTGGCTOCGGCCCCCCGGGCCCTGGCCCCCCATGACGATG- 

GAAGAGGCGGCTGATGATCTGCTGGTACTGI I ICI I GTGGGTAGGGGGCAGGGCCA- 

CAGCAGGGGCCTGCTCCATGGAGCCCCTGCGTTTGAGGGGCCGGGGAATTTCCGC- 

CAACACCCGTGCCACCTCCTCCAGCTCGGGCACCGACTGTGCCTCCGGTGGCAGTG- 

CTGGCTGCAGCCTCGTGGGGCTGAGAGGCCTTGCTACAGGGCCTTCATCCACATCG- 

CCAGCCTCCAGCACTGGTGTCAGCAGCCCCTCTATCTCCGGCTCAGGCTCCAGCTCG- 

GTGGGGGGTTTGGGGGGTCCTAGCCGGAACAAGAGCCCATCAGAGGACAGGTCCC- 

CAGGAGACACCCAACACTCCCTCTCCACAACTTCCAGGGCATACAACCAGCACATGAT- 

TTTCTGTGTGACCTCAGGGAAGTTCCTTGCCCTCTCTGGGCTACACTTTCCTTGGGCT- 

GTGAATAATATACAATTATGATGCCTCCCATTTATTGAGCAGTTAGTATGTGCCTGGCG- 

CTTTACATGCCTACCTTATTGTAATCTCACCACTGCTTTGTGAGGTAGATACACTGC- 

CATCTCCACATTACCGAAAGGGAATCTGGGCCTCAGAGAGGACAAGTCAGTTGCC- 

CAAAGCCATGCAGTTGGGACTTGAACTCAGTTCTGGCTGACTCTAGAATCTACTTC- 

TACCAACCGTGATAGATGTGATTTTCTGAGATCCTGAGAGTTTCCTCTCCTAACATCT- 

CAGGCAGAAAACTCCAGCAGGAAGTAGAATCCTGGTGTrTAATGATTTCTTCTCTGTCT- 

TACTCATTCTGACAGTAAAGCAGGTGGAAATAAAAATATGCATTATTGGCT- 

GAGTCGAGTGGCTCACACCTGTAATCCCAGAACTTTGGGAGGCCGAGGCAGGCA- 

GATCTCTTGAGATCAGGAGTTTGAGACCAGCCTGGCCAACATGGTAAAACCCTGTCTC- 

TACTAAAAATACAAAAAAAAAAAAAAAAAAAAAAAAATTAGCTGGGCGTGGTGGCACAT- 

GCCTGTAATCCCAGCTACTCGGAAGGCTGAGGCACAGGAATCGCTTGAACCCAG- 

GAGGCGGAGGTTGCAGTGAGCCGAGATTGCACCACTGCACCACTGCACTCCAGCCT- 

GGGCAAAAGAGTGAGATTTCATCTCAAAATATATATATATACACACACACACACAAACA- 

CACACACACATTATATATATAGTGTATATATATTTTTATATAGTATGCATATACATATAA- 

ATAATACACACACACACACACGGCTGAGCATGGTGGCTCATGCCTGTAATCCCAGCAC- 

TTTGGGAGGCTGAGGTGGGTGGATCACCTGAGGTCAGGGGTTCGAGACCAGCCTGG- 

CCAACATGGCAAAACCTCATCTCTACTAAAAACACAAAAAATTAGTTGGGTGTGGTG- 

GTGCATGCCTGTAACCCCAGCTACTTGGGAAGCTGAGGTAGGAGAATCGCTTGAACC- 

TGGGAGGTGTAGGATGCAGTGAGCTGAAACCTCACCACTGCATTCCAGCCTGGGCAA- 

GAAGAGTGAAACTCCATCTTGGCTGGGCACGGTGGTTCACGCCTGTAATCCCAGCAC 

TTTGGGAGGCCGAGGTGGGCAGATCATGAGGTCAGGAGATCGAGACCATCCTGGC- 

TAACATGATGAAACCCCGTCTCTACTAAAAATACAAAAATTAGCTGGGGGTGGTGGTG- 

GGCGCCTGTAGTCCCAGCCACTCGGGAGGCTGAGGCAGGAGAATGGCGTGAACCCG- 

GGAGGCGGAGCTTGCAGTGAGCAAGCACCACTGCACTCCAACCTGGAAGAAAGAG- 

CGAGACTCTGTCTCAAAAAAAAAGAGTGAAACTCTGTCTCAAAAATAAATAAATAA- 

ATAAACCCCAAAACACACAGACATACACATTATTTCATTGAATCCCCGTCACAATTCTA- 

TAGGGTAGATATTATTAATCTCTCTTCACAGACGGGAAACAGAGTTTCGGACAAGTAAT-
TTATCTTCAGTCACACAOCAAGTTAGCAGTGAAGAGAGACTCCAGCCCATCTGCT.
TAACTCACTGATCTCACACCTCAAAATATTAATAAATTATTATAACTAATATGGTAGC- 
TATTTATTTGAGACTGGGTCTCACTCTGTCACCCAGGCTGGAGTGCAGTGGCGCTAT- 
CACAGCTCACTGCAGCCTGGATCTCCCAGGCTTAAATGATCCTCCCACCTCAGCATCC- 
TGAGTAGCTGGGACTACAGGCGCCCACTACCATGCCCGGCAGATTTlTTGTACTTT-
T A I I M lA GTAAAGTCTATnTAGTTTCACTATGTTGCCCAGGCTGGTCTTGAACTCCA- 
GAGCTCAAGCAATCCTGTCTGCATTAGCCCACCAAACTGCTAGGATTACAAGGGT- 
GAGCCACGGTGCCTGGCTAATATGGTAGCH-ATTGATAGarTACTATGTATCAGATCC. 
TATTTATTTATTTATTTTTGAGACAGAGTCT^ 

GGCATGATC7TGGCTCACTGCCACCTCCGCCTCCTTGGCTCAAGCTGAGTAGCTAG-
GACTACAGTGGTG AGCCACCATGOCC AGCT AA1 II M M 1 1 1 1 II I II 11 m 1 1 GAT A- 
GAGATGGGATTTCATCATGTTGTCCAGGCTGGTCTTGAACTCCTGACCTCAAGTGATC- 
TGCCCACCTCGGCCTCCCAAAGTGCTGGGATTACAGGTGTGAGCAACTGCACCTGGC- 
CCATCAGGTGCTGTTTTAAAGGCTTTATATGAATTTAATAACATATGTCAATAG- 

GATCGATTCTATCATTATTTQC C \ I I 1 11 N U U I HI U 1 1 1 GAQGCAQAGTCTCCC-

CGTCACCCAGGATGGACTGCAGTGGCGCAATCTCGGCTCACTGCAACCTCCACCTCC- 
GGGGTCCAAGTGATTCTCCTGCCTCA GCCTC GCAAGTAGCTGGGACTACAGGCGCO 
CGCCACCATGCCTGGCTAATI I I IGTAI I 1 1 ! AGTAGAG ATGGGGTTTCATATTGGC- 
CAGGCTGGTCTCGAACTTCTGACTTTGTGATCCGCCCGCCTCGGCCTCCCAAAGTGC- 

TGGGATTACAGGCATGAGCCACCGTGCCCGGCCCATTATTTCCCTTTTACACTCAA-
GAAAATTGAGGCCCAGTGAGGTTAAGTGACTTGCCCAAGGTCACACAGCGTGGAAC- 
CAGGCAGTCTGGCTTCAGGGTCCACACTTAACCTTTGAGCTATCCCTGGCTCCTACC- 
CAAATTCCCAAACTCACCTGGCCTAGCTCTCTGCAGGGACAGTGCTTGTAAAGAGG- 
CATTTGGCTGTGATCTCCCCACCTCCCAGGGCTGGTCTGGTCCCCCTGCCATTTGTCC- 

TCCXrrrCACCCAGTCCTCTAGGGCCCTCATTGCTGACTCACCTTCGTTCACAGGGGC-
CATGTCTGTTGGGGATGCTGGGGGGCTGGGGTAGGGGTTTGGGGTTGGGTCTGGGG- 
CTGTGGGGGdAGCTGGGGCTGTGGTTGTGATTGTGGCTGGGGCTGTGGTTGTGGTT- 
GGGGCTGCAGCTTAGGCGGGGGTGCTCGGGTGAAGAGGGGGGACCCAGGGAGCAT- 
GGCGCGGCTGGCCCCGTGCTCCCAGAAGGCGTTCTGCAGCTTGAAGATCATGCT. 

GAGGGGGATGGGACGCTGGCGCGGGGCCCCGCGGGGCTGGGGGCTGGAGGGGGG-
CATGGGGATGCGGCTGACGGGCTGCCAGCTGCGAGGCAAAGTGCCCGAGGGCCC- 
CGCGGAGCCCAGCGAGCGCCGGTAGCTGCCCGCGTCTGAACGCCGGTCGCTGGC- 
CAGAGGAGAGACCTTGTAATTGCGCGGCAGGGTGGCGCTAGTGAGGTTGTCX5TGGG- 
GAAGAGGGAAGGGAGAAGGGGATCGGGTGAGAGAGGGAAGGTGGAGGGGAGG- 

TAAAGACAAAAGACGAGAAGGGAGAGGAGGTGAGGGAAiSCCCTGGGAGTGAGGGA-
GAAGAAAGGGTGAGGAAGGAGCAGAAACCCAGCACAGTGAAGGGAGAGCGTGG- 
GAACGGGCGCCGAGACCCAGATCGCAGCCCCGAGGGGGAGACTGGCCTTGACCC- 
CGCTCCCCCACCCCACTCCTCGACCTTCCCCAGCCTCTCCTCCCCAGGCGTCGCCTC- 
CTCACCTTGCCGGTGCCCCCCAGTCCATCCAGGCTGCTCTCCCTCCAAGGCAACAGC- 

TGCAGGCTCGGCGAGGCAGGCCTTGCGAAGACGTCCAGGCCTGCGGGGCGGGAAT-
CATTAGGGTCTGTGGGGCTGCCTCTCCTCCGGGTCCTCCATTCCCCGGGCCTCCAC- 
CACTCACGTTCATAGCTCGCTGTCTGCGAAGGCTTCTTCTCGTACGCCACGTCCAGGT- 
CAGACTCGTTCCAGGCTTTCGGAGGCCGCCGGCGCAGCGTCAGGTCGTCTGGGGA- 
GAAGTTTCCAGGGAGGATGAGACGGGAGGGGTGGCGAGCCCCGGATCCTGCCCGCT* 

TTGACCCCGCGAGTCAAAGGCCCGGCGAGGGGCCCCTGGGTTCACCTTGCGCGCG-

CAGAGGCGGGGCGAATGCGCTGCCGCCGGAGCCTAGCAGGGAGCTCCCGAAGGCG-. 
GACGCTGGCGCGTCGTAGGCTGTGGCAGGGGGGCGCGGTGACGGCCCACGCTCGG- 
GGAAGAAGGCCTGGGGCCCCTCCGCCAGGGGGCTGCCGCGGGGGGAGCCTGCG- 
CGGCCCAGGAAGTCGAAAGGCGTGGGGGGACCCTGCTGGCGGAGCGGGCCTGGCC- 

CGGGCCGCGGGGAGGGCGCACGGCCGAGG3AGCTGCCTGCGCCATCGAAGGCG-
CGGGGpCGGGGCGAGGTCGCGCGGTCCAGGCTGCCGTAGGCGTCCGGCTGCAGG* 
TAGAGCGGGGTGCGCGGCGACGACGGCCGTGCGTTGGGGGACAGCGGGCTGTAGG- 
GGTGTAGGGTTGGGGCACTCTCTGATCGTCCGAACGGGGTGTCTGCGCCGTCGGT- 
GGCCGCCTTGCGGGGGGACCCTCGGCTGCCGAAGGGCTCAGGGATCGAGCTGGAG- 

CTGTACQGGGGCGGCTGTGGGGAGGCCAGGGCATTGAGGGATGGATCAAAGGAGA-
CATTAGTGGAAGGGTTGGTGTGTGGGCGGGGGTGTCAAGAGAGATCACTGGAGGT- 
CAACCCAGAGGAGGCTGACCGGCCATGGAAATTCAGGCACAGAGAGCCCAGGTGAG- 
TAGTGGTGGGGAGACAGCCCTGAATCAGCACTGTGGCTAGCCCATTACTCTATGT-
CACCTTTATGCCACTTAGGTAAACACCTCTTTCCTTCTGAGGGTCCCTTTAC3ATGTC^

CACTTCCACTGGTCCCCT U I 1 1C TATTTC H I C I I 1 CTTTCTTTCTCTCTCTTTCTTT- 

TCTTTCn I CM 1(^TCTCTCTCCTTCCTTCCmCTCrrCTCTCTCCTTCCCTCCCTCCC- 

TCCCTCCCTGCTTGCTTGCTTTCTCTCTCTCTCTTTCTTTC I rTCTTTCTI I CTTTCTT- 

TCn ICIHClllCIII ICTATCTCGGCTQATTGCAGCCTCAACCTCCCTGGCTTAGT- 

GTGATCCTCCCACTTCAGCCTCCCAAGTAGCTGGGATTACAGGTATGCACCACCA- 

CACCTGGCTAACTTTTGTATTTTTAGTAGAGACAGGGTTTCACCATGTTAGCCAGGCT* 

GQTCTTAAACTCCTGACCTCAAGTGATCCGCCTGTCTGTGAAAGTGTTGAGATTA- 

CAGGCGTGAACCACCGTGCODAGCC^GATTTTTAAAAAATCATTTGTAGAG 

TCAAACTCTTAGTCTCAAGCAATTCTCTCACCTCGCCTTCCAAAGTGCTGGGATTC- 

CAGGTCTGAGCCATCGCGCCTGGCCTGGTCCCCTTTTTTCAAGTTCCCTTGAAGAGC- 

CC ACAACCTGCATAACT ATATGGGGCAATTTTGCCTGAAATCCAGGCCTCTGGTCT G- 

GACTGTGGCGAGAGGCTGGCTTTGGAGATCAAGGTGGGAACCAGGCTTACCCTA- 

GAAGGGGGTCGGGCCTGCGGGCCAGGAGGCGCGGGAGAGTCTGACCACAGCGAC- 

TCCAGCTGCTTGQTCAGTTCATCCACCTTGGCCGCCGCCGTGTCCAGCTCCATCTGC- 

TTCAGATCCATGTGTTTCATGGCCAGCGCTGGGAAGGTGGGAGTGGAGGTAAGGACC- 

TGGCCTCCTGGCAGGGGCCGGCCTCAGCACCCCTCGCCCGCTGCCGAGGTCCCCG- 

CCTCGCCAGCCCCGCCCCCTACTCCAGCTTACACTGGAAGTTCATGTCCAGAAAGTC- 

CCGCGCGCTCTGGAATGCCTCGCTGTCCATGGTGCCGGCCGGAGCGGGCGCCTG- 

CATGGTGGGGAGGGAGGGAGCTGGCTAAGACCCCGCCCCTCTAGACCCCGCCCT- 

CAGGGAGTCAGACGCCGTCAGGAGCGGGACAACGCCTCAACTCAGTTCCTTCCCCT- 

GGAAGCCCTTTACCC7TTCACCTCCCCAGCTGGGAAATGCCAACTCCTCCAAAGC- 

CAAGTCC ATGCGCCACGG AGAAGTCCAAACCCAGTCTAAAACCTCCGGAATTCACTT- 

TCTCTTTC I J I I J 1 1 CI I I ICI I I 1 1 I 1 1 1 1 I I M I I I GTGTATGTGTGTGAG ACA- 

GAGTCTCGCTCTGTCGCCCAGGCGGGAGTGCAATGACGCGATCTTGGCTCACTG. 

CAACCTCCGCCTCCCGGGTTCAAGCAAATCTTCTGCCTAGCTGGGACTACAAGCGCG- 

CGCCATTATGCCCGGCTAAl 1 1 1 1 GTAGTTCTGGGATTACAGGAGTGAGTCTCCGCGC- 

CCGGCCGTGTCCATCTCTTTATCTCAGTCCTAAGACCTGAATCACTCCTTGAACAAT- 

TATCTATTGATCACCTACAATGTGGCGGTAAACATAGGATGGAATAACTATGAATTACT- 

GAATGTTTACTAGG GAGC AGGACGCACTQTGCTAGATCCTGI I I T TGTTTG l ' l I I I G A- 

GATGGTGTCTCGCATTTTCGCCCAGGCTGGAGTGCAGTGGCGCGATCTCGGCTCACT- 

GCAAGCTCCGCCTCCAGGGTTCATGCCAGTCTCCTGTCTCAGCCTCCCGAGTAGCTG- 

GGACTACAGGCGCCTGCCACCATGCCTGGCTAAATTTTTGTATTTTTAGTAGAGACGG- 

GGTTTCACCGTGTCAGCCAGGATGGTCTCGATCTCCTGACCGCGTGATCCATCTGCC- 

TCGGCCTCCCAAAGTGCTGGGATTACAGGCGTGAGCCACCGCGCCCGGCCCTTGTTr- 

TTGTTTTTTAATAATAATTCTGCTGTCTGCTGTGTACTAGAACCCATGCCTACTGCTTG" 

GGGTATAATGTAGTAAATGTAGTAAAAACAATATCCGCCGGGCGCGGTGGCTCACGC- 

CTGTAATTCCAGCACTTTGGGAGGCCAAGGAGGGCGGATCACGAGGTCAGGAGAG- 

CGAGACCATCCTGGCTAACATGGTGAAACCCCGTCTCTACTAAAAATACCAAAAAT' 

TAGCCAGGCGTGGTGATGGACGCCTGTAGTCCCAGCTACTCGGGAGGCTGAGGCAG- 

GAGAACGGCGTGAACCCGGGAGGTGGAGCTTGAACTGAGCGGAGATCGCGCCACTG- 

CACTCCAGCCTGGGCGACAGTGCGAGACTCCGTCTTAAAACAAACAAATAAATAA- 

ATATGTTTAAAACAACAACAACAATAACCAGCCAGGCGCGGTGGTTCACTCCTGTAAC- 

CCGAGCACTTTGGGAGGCCGAGGTGGATGGAT<iGCTTGAAGCCAGGAGACCAGCCT- . 

GGCCAATATGGTGAAACCCCGTCTCTACAAAAAAATACAAAAGTTAGCTGGGCATGGT- 

GGCATGTGCCTGTAATCCCAGCTACTCAGGAGGCTGAGGCACAAGGCTCACTTGAAC- 

CTGGGAGGCACAGGTTGCAGTGAGCATAGATTGTGTCACTGCACTGCAGCTTGGGT- 

GACAGAGCGAGGCTCTATTTAAAAAAAAAAAAATTAATTGAGGGGCCACTCCCTTCTA- 

GAGTGGTGAGAAATGCCGTGCACCGAAAGCTTCATTTGATGGTCAAAACCACCCTAG- 

CAGGCAAGAAAGCATGGCTCAGAAACATATGTTCAAGGTCACCCTGCAAGAAGTCGG- 

TAGTAATCGGTTTCACACCCGCATCTAACTTATTCTGGGTCATCTCTACCAGATTAGAG- 

GGGTCCTAGAGGGAAGCGACT6CTCAGCTTCCTTTCCCTAGGGTCCCCATTCAGTG- 

GAGGTCTGGCTCTCACTGACCCATTGTTAGCAAGAGGAACAGGGAGGTGGCCAGGG- 

GTGGAGGGGCAGCTGTGGTCACTGGCCCAGTGGGAGGGAGCTAGGCCACTAGGAAC- 

CGGTCAGGCCAGCACCATCCCTATCCCCATGCTAGCCACCACACCCACCAGCTCTGC- 

CACCTCCCTGCTGCATCGACCACTTAGCTCTGGCAGTATAGGCAGCAGGGCAGGCTG* 

GGGCATGCTGATACCCGCCTCTGTCTGGGAAGTCGAAGGAACAGAACCTGrrCAGGC- 

TGGCGGCTCATTTGGATGAACAGGGAGTGTGTGACCTTGGGCGTTGAGTCCTCTC-
CACTCCCTGG6CCTCAGTCTCCCCAACATC/y\AGAAGAAGGCAAATCA0CI II 1 1 I ITI-
TTTTTTGAGATAGGGTCTCGCTCTGTAACCCAGGCTACMTTGTGACTCACTACAGC^ 
TCTTGACCTCCCAGCTCAAGTGGTCCTCCCACCTCAG CCTCGTGAGTAGCTQAGACT A- 
TAGGTATAGCCTCGCACCACCACACCCAGCTAAI I 1 1 M II I M 1 1 1 1 1 \ I I II 1 1 1 1 I I- 
TTTTTTTGAGACGGAGTCTTGCTCTGTCGCCCAGGCTGGAGTTCAGTGGCGGGATC-

TOGGCTCACTGCAAGCTCCGCCTCCCGQGTTCACGCCATTCTC CCGCCT C AGCCT CC" 
CAAGTAGCTGGGACTACAGGCGCCCGCCACTACGCCCGGCTAATTTTTGTATTTTAG- 
TAGAGACGGGGTTTCACCATTTTAGCCGGGATGGTCTCGATCTCCTGACCTCATGATC- 
CGCCCGCCTCGGCCTCCCAAAGTGCTGGGATTACAGGCGTGAGCCACCGCGCCCGG- 

CCACCGAGCTAATTTTTTAAAAACATTTTGTACACTTTGGGAGGCTAAGGCGGGAG-

GATCACGAGGTCAGGAGCTCGAGACCATCCTGGCTAACACAGGTGAAACCCTGTCTO 
TACTAAAAAATACAAAAAAATTAGCTGGGCGTGGTGGCGGGCGCCTQTA QTCCC AGO 
TACTCGGGAGGCTGAGGCAGGAGAATGGTGTGAACCAGGGAGGCGGAGCTTTCAGT- 
GAGpCGAGATCGCGCCACTGCACTCCAGCCTCGGAGACAGAGCGAGACTCCGTCC- 

1 5 CAAAAAAAAAAAAAAAAAAAATTTGTAGAGACAGATCAAGTCTCAfcTTTGTTGCTCAGG- 
CTGGTTTTGAACTCCTGGGCTCAAGCAATCCTCCCGCCTCAGCCTCCCAAAGTGCT- 
GAGATTACAGGCATGAGCCACCACACCTGGCCAAATCAGCTATTCTGAAAGGCCCCTT- 
TAATCTCTATGAGCCCCAGAfcTTTCAAACTGTAAGGACCTTAGGA CTGTAA CTAAAQT-
TCTACAGAGCCTAAACCCCTCAGCTAAAGAGCCTATTGTTGGAAAGTTCTGAGTCCAA- 

GATTCTATCTTTGGAACATTCTAGMTTCTCCAATTTGTCTAACOCAGAATTCTGAGTCT-
TTCTGTACCACATTeTACCTAACCCAGGGTTGCACTGCTCTGGAAGTCTAGATGGATG- 
GTATAGTGCAGCTGGTAAAAGCATGAGTAAGAAGTCAGACTTCAAAAATTCAAATCT- 
GAGGGCCGGGCATGGTAGCTTCTGCCTGTAATCCTTGCACTTTGGGAGGCCGAGGG- 
GGGAGGATCACTTGAGGCCAGGAGTTCAAGACCAACATGGCCAACACAATGAGACCC- 

CATTTCTTAAAAAAAATTAAAATAAAATCATCAAATCTGGCAGC^CCACCGTCCaACCC-

tgaccacagtacctcagtctcgtaatccgtaaaatggggatgaaagttcacctcatag- 
gactactgtaagaatccacctggtcagaaggtgcaggaagaattcagagctctgaga- 

ATTGAGGCCTCAGGAAGAAGAGACTACAGGAATAAAAACTCGGGCATTTAGAATTTCA- 
QAGATACACAAACAATACTTrGTTAACTGT TAAAATAG ATAAATGAGCAAGTCTGTG- 

CAGCCCTAATGCCAGCTGTAAGTGACTC \ II I 1 1 1 1 ICI I I I GGTAGAGATTTAGTCTC-
TCTCGCGCCTGTGGTTAGGCTGGTCTCGAACTCCTAGCCTCATGGGATCCTCCCCGG- 
CTCGATCTCCCAAAGTATTGGGATTACAGGCGTGAGCACGGCGCCATGATCCCGAA- 
ATTTCCAAGATTCTCAGATTCCATACTGACATTGTCTGGCTCTCAGGAAATGCCAACCC- 
TGGGTGTGGGGCTGTCGCGGGGACAGGCGGTGGGGACGTCGGAGCCACCAGGGGG- 

CGGTCACGCCCGGACCCCCGCCAGGAGGGCGGACTGCGCCTGAGCTCAGGCCCGG-
GGAATGCGCAGCGGGCCCGGGCAGGTGCTGTACATCCCGGGGCAAGGGAGCTGGG" 
CCGGGCGGGGTACAAGGGCGGGGCGCGGGGGTGGCGCGGGCCGTGTGTCTGTTCC- 
CAGGCCTCTGCCCCTGACCTCTGCCTCCGAGTCCTCTCCCATGTGCTCCCCTCTAGC- 
TCTAGCTCCGAGCTCTCCCGCGGGCTCTGGGCCAGCCGCAGGTACTCTCCCCTGGG- 

CTCCTCTCTCCGCTCCACCCCTGGCTCTCCTTCCCTGGCCTCCTCTGOACCCCAGC-
CAGGTTCTTTAGGGCTAAGGATCCTGTGGACTrCCTGGAGGAGTCATCTTCAGTAG- 
GAACCGGGtCAGAGAGCCAGACTGAGCTGGGAACACCCAGGCTGGACTCCTACAGC- 
CCTGTCGGGTCACACTGAATCTGGAGAGGCTCCACTGTCTCTGGGACTCGGTTTCC- 
TCCTTTGTGGACGTCTATGGAATGGGCTAGGGCCTTTCTTGCTCTAAGGCTCTACTTG- 

GGCTTGTTATTTAGCTTCTCTGTGCCTGTTTCCTCATGTGGACCATGGGAAGAATTA-
ATACCTTCGCCTCAAAGGGGTATGAGGATTGAGTGACATAATTTATAAGCCGTGATTA- 
GAACAATGCAGTGCGCGAAATAAAGTTCACACATACAGGATTCATAATTACCAGAT- 
GTCCTTGGCTGTTCATTATAATAAGACAGGGTCTGGCAACAGAGTGAGGGGTCCAGAC- 
TCAATGTA AI I I 1 1 I 1 I I CCCCTAAAAGGGCCCTTTCAACTCTTTCTGAGATCATACAAG- 

CCCTGAGTTTTGACACCCAGGGTCTCAACTTCCTGAGCCCTTGCCTCTCAGAGTCC-
TAAATTTCCCCTGTACATTCCTGAGTCTGGCCAGTGATCACCCTCAGTCACTTAGG-. 
GACGGGAGGGCTGGGAGAGCCCTGGAAGATTCCAGACAGAAGCTGGCAAAAGCC- 
CAGGGTGrrGGGCAATATCCACTCTCCAGCCTCCGTTTCTCCACTCGTAATGAG- 
GAGTCCTTCCCTGGGGTCAGCAAACCTTATTCAAAGGGAGACCTCTCAGTCACCCAA- 

GATTCCTCTAGACAATGCGAGCTTTCCTACCTACCTACCTACCAGCTCTGAGCTrGG-
TACACCCAGAGCCCTGTTTTGGCAACCACGGTTATTATTT7TAATTTCAT7TCAGGT- 
TATCATCAAATGCCCTTCAAGCCCAGACATTGGGAAACACTCCTCTCTCATCAGATGC- 
TCGCCTCCCCCATTCTG 1 1 1 1 I A ATCCCCCTTCTTAGGACGCATGGGGGTTGAGA- 
OAACGGGGAGATAGACAGAOOQACGTGCCTGGTCCTGCCCTCCCCCCGCCTCAAG-

GACAGACAGACACCTCCAGAATTAGCCTCTGTCCCTCCTTATCTCCCACAATACCC- 

CAGGTCAGACAGATGGGCGTGGAGGTGACATTTCTCACCTCAGGGTCAGGGCAAG- 

GAGCCCTG AGGC AG AAG GTTAGTCAG AAAATCTG G CGGGG GCG G ATG G AATCC- 

CGTCCCCCAGAGAGCTGCAGAAGAAGGAGGAGGCAGAATCCTGACCCTACAAACTC- 

TACTGCCTGTGTGAGCTCCAAGCCTCAGTTTAC^CCTTCCTCTCCGTGTAATGGTTAAr 

ATGCCCG^CTATGCAAAdCTCCCAGAATCCAATAGCCGCTTTCCGGAATTCTGCCCT- 

GGGTTCTAGAACTACCTCTGCAAACCCAGCTGTTTCCCACCCCATAAGGCAATAGGG- 

GAGCCCACCTCCGCCAGGGGGTGCCCTAGGGCGGATGTCCCTTCTCTGGTTAGG- 

CAGGTCTGACGCCCAGGTTAATGACATGTTGGGTTCGCTCAGCGGCACAGAGGAG- 

GTTGGAGATCTGCCTCGGTGTTTTCTCTCCTACCCCGCCCCCATCCCCGAGC- 

CGAAAAGTCGGGGGAGAGCCGGGACACAGCCTCCGGAGGGACCCCGGGTACCT- 

GTCCTGCTCCACTTCAGGMCC^AGGCTCCACTATCCCTGCCCCACCCTTAATTCTGC- 

TCAGAGACCTAGAAGATCGGTCGAGACAGCAGCTTGAGGCTGGCAGGGTGGTCACC- 

CATTCCACCTTGAGCCCCACCAGTCTGAGCCTCTCATTTCTGACCAAGACTCGGGGAT- 

TCGAACCCCTATACTACCCAAAGACTCGGCTTCCTAGAGCCCCCCAGTTCGAGGGAC- 

TCAGGAATTCCAGCTCCAACGTCTCCCCGGGATGAAGGGGTAGAATCCCTCCATTC- 

CAAGAATTCAGGCATCCGAACCCGCTTTCCTTCCCTCCAGTAAAACAGGCAACGGAGT- 

TTCCrrCTAAGGATCCAGGTGTCGGCGCGCCCCAAATTCCGCCCTGGGACCTGG- 

CGTCCGAGTCCCCTCCCAATCCTCCCAGQGACGCGGGTGTTGGG C1 \ I I T C AGGGCO 

TCTGGTCCCCAGGAGGGTGAAACTCACGGATCCGGGCAGATCCTGGCACCTGGGGG- 

CTTCGTCCAGCTCGGGCTCCGGCTTGGGGAGCGGAGAACGGGGCGGGGCAGGAGC* 

TGGGAACAGGTTAGACGACGTGACTTGGGCTGGAGGGAGGCGGGTCCCGGTGGG- 

GAGGGGGAGCCAAGGTCGCCTCGAGCACCTTGGGACTTGTAGTCCCGGAGGGACAG- 

GACGTAGCCCAAGACGATCCCATTTGGATTCACCCAGAGTCCATTTCACAGACAG- 

GAAGGGCGAGGCCCAGAAGCCGAGAGCGACCAGGCCAGGGAGATACAGAAGAGC- 

CGAGACGCCTGCCTCGCTGTGGCTGGAGACTGACTCCTGAGCCCTTGCCCCACCCCT- 

TCAGGCGCACTATCCCCTTTCCTGATCAGTATCCCCCAGGGTCTCTGAGCCCGAATC- 

TCCCCGTCGATAAAAAGCGCGGGTTGGATCTTCAAAGGATGTCCCAGCAAGAGTT- 

CAAAATCTTAGTTTGGACTACAACCCCCAGCAGCCTCCGCGACCGCCCTCGGGCGAC- 

TCTTTGCCTCGGGTCCTGTGGGAATTGTAGTCCTGGAGCCCGCAGGGCTGCACCC- 

CGGTGTCTCTCTCGCCCACGCGAAGGAAACCGTCTGGAGATCCTGGATAGGGGAAA- 

GATTTCCCCTTCCCGTTGACCCTCCCTCCGCTCTGGAAAGCCTCTCCCACCTGGGGA- 

GAAGGGGTGCCCCAATTCTGGAGTAGGATCCTAAATCTTGGCAGAGGGGGCGG- 

GAAGTGGCGCTGACACACTGGCCAGGAATGCAGTCGGGTCACCCTGTCTAGCCAC- 

CGTCTCGCGGCTCCAACCGCCGCCCAACGCGGGGCGGCCCCAGTGGGAAGG- 

GAAGTGGGTGCGTCCCCCAAATCTGTGTCCACGTGCCGCTGTTTACACGCTCCCTGG- 

GGGAGGGAGGAGTCGCCGATCAGGTCCCTTCCTGAAAGTCATCGAGGTTTCCCACG- 

CATQAGACTAAACCCCCGAGGGCATCTACAAGTCCCATTrGATCCACAAACGCTACAO 

CGTGCCCAGCACCACTCCACGCGTGTGGGGCTCCTGGGTCCGAGGCTCCGCCC- 

TCGAGAACCACAAGCTCCTCCCCCTATGTTTCCCGCTCCCCCGGAGTCCAGAAGCCC- 

CGCCCCTGGCTGGAACTTGACGCCCTCCGGACGGATTGCCCCTATTTCTCCATTTTCC- 

CGCTTCTCCCAGTCAAGTTCTGAACTTGTGAGGCATCTGGGCCTCCCCAGAAGACATT- 

TAACACAGAAAGCACAGCCCTACTAACTAGTATTCTTACCTGTCTCTTCAAGAATTTCA- 

GACCAATCGACCGTCCTGTCTCTTTAAGGCTTAGGAAGAGCAGTGTGGCTGCCCCTT- 

TAAGGAGGCGTTGCAACAAACCATATTGGACAGACGATGGGGGCGACCCATCGG- 

G ACCCGA CGGGCCTCTGACTCCAGCAATACAGCQAATCAGCGGGTTTCGGGAATA- 

CATT TTTCGG AAAAAGACTTCTTCCTCGGTTTTCTGCTCTGCACACGTTGAAATTTTCC- 

CCAGTTT7TCCTGCAGATCGGGAGTCGAGCAATGCCTACCCCCGCGCTCCCGCAC- 

CAGTTGGGCGCTCCCGGATGATGCCCTACCCCTTTGGATCCACGTGGTCTGCAACCT- 

GGTGCGAGCAGCCCGGGCTACAGGGTTGCCTGAGGTGTGGGTCCCAGGATGGAG- 

GAGCCCCAGGCCGGCGGTGAGGGTGCGGGTTGACGGGGTGCGGAGGGTGCG7TG- 

GTGGAAGGAGAAAGGGGCGTCCGAGAGGGTTCGGGCGGAAAAGGAGGCGTACCTG- 

CAAGCAGGACTTGCGAAGAGCGTGCATTCCCAGTGGGCGAACGGGAATTCGAACG- 

GAGAGAGGGTTATCTTGTGGGGGGCTACCCGTGGAGAGCAAGGCGCCCCCAGGG- 

GTTGGATCGGTGAAATTGAGGTCGCCCCTGGGGAACAGGTGGGCAGAAAGGA- 

GAAACCAGGTTGAGGGGACTGGAGTGCTCACGAGGTTAAGACCAATGGACCGA- 

TAGGCGCGCCCTGCAAGATTGGACCGGCAAGGAGGTGTCAGTCGACCCCATTTCCCC- 
ttctgctgcagatgctgctcggttctcttgtccccccaactttaccgcgaagcccc-
cagcctcagagtcccctcgtttctccttggagopgctgacgggtccagatacggagc 
tgtggcttattcaggcccctgcagactrtgccccagaatggtgagtggtcrrgtt- 
gacggaaaagagggtcccggtccagaccccaagagcgggttcttgaatttgtcacag- 
5 gaaagaattagaggtgagtcacagaqcacagtgaaagaaacaagtttattqgaaac- 
tactcctttacagagtagagtgtcctcagaaagcagggggagaaacccacagccct- 
ttgttagtatttctagttataagaaactataaggaactatagttaaacttggagtgtg- 
cagataagctcactaaaggtaggggctattggtgttatccacgaccattaatcctg- 
caacctaagcttgctcatttatgttatatttaagtaatgggggctgcattcttagga- 

1 0 catttggacattctgcaggcttggtggaacatgttctgtatggccataaatattctgta- 
attataattggtggtcagcct6ggatgtggttattttca6gccataagcatgaacctt* 
gtaagtgcctagctactcactttaagatggagtcact^ 
cagaggccagccaggcgcagtggctggtgcctgtaatcccatcctttgggaggc- 
cgaggcgagcagatcacvtgaggtcagg agttcaagacc agcctggccaacatagt- 

15 ' gamttgtctctactaaaaatacaaaaattggctgggcgtggtggcaggtgcctgta- 
atcccagctacttgagaggctgaggcaggagaatcgcttgaacccaggaggtgga- 
cattgcagtgagccgagatcatqcgactgcactccagcctaggcaacagagcaagac- 
tctctc aaaaaaaaac aaaaaaaaaatcaaaaaac cttccctctcctgttccacttaag- 
cctctgccotccctgtttctctctgtagcttcaatgggcggcatgtgcctctctctgg- 

20 ctcccagatcgtcaagggcaaattggcaggcaagcggcaccgctatcgagtcctcag. 
cagctgtccccaagctggagaagcgaccctgctggccccctcaacggaggcaggaG" 
gtggactcacctgtgcctcagccgcccagggcaccctaaggatccttgagggtccc- 
cagcaatccctgtcagggagccctctgcagcccatcccagcaagtcccccaccaca- 
gatccctcctggcctgaggcctcggttctgtgcctttgggggcaacccaccagtca- 

25 cagggcctaggtcagccttgqcccccaacctgctcacctcagggaagaagaaaaag- 
gagatgcaggtgacagaggccccagtcactcaggaggcagtgaatgggcacggggc- 
cctggaggtggacatggctttggggtcgccagaaatggatgtgcggaagaagaa- 
gaagaaaaaaaatcagcagctgaaagaaccagaggcagcagggcctgtggggaca- 
gagcccacagtggagacactggagcctctgggagtgctgttcccgtccaccaccaa- 

30 gaagaggaagaagcccaaagggaaagaaaccttcgagccagaagacaagacagt- 
gmgcaggaacagattaacactgagcctctagaagacacagtcctgtccccgac- 
caaaaagagaaagaggcaaaaggggacggaagggatggagccagaggagggggt- 
gacagttgaotctcagccacaggtgaaggtggagccactggaggaagccatccctct- 
gccccctacgaagaagaggaaaaaagaaaagggacagatggcaatgatggagccag- 

35 ogacggaggcgatggagccagtggagccggagatgaagcctctggagtccccagg- 
ggggaccatggcgcctcaacagccagaaggagcgaagcctcaggccca6gcagctc- 
tggcagctcccaaaaagaagacgaagaaagaaaaacagcaagatgccacagtggagc- 
cagagacagaggtggtggggcctgagctgccggatgaccttgagcctcaggcagc- 
tcccacatccaccaagaagaagaagaagaagaaagagagaggtcacacagtgact- 

40 gagccaattcagccactagagcctgaactgccaggggagggacagcotgaagccag- 
ggcaactccgggatccaccaagaagaggaagaagcagagtcaggaaagccggatgc- 
cagagacagtgccccaagaggagatgccagggccgccactgaattcagagtctggg- 
gaggaggctcccacaggcggggacaagaagcggaagcagcagcagcagcagcct- 
gtgtagtctgcccccgggaaactgaggaactaaagaaagctgaaggtgcccacctg- 

45 ggccaccagaaggtgacacccccagaatccctccccagagactgcaccagcgcagcc 

Sequence of the s region of chromosome 19 

The following depicts the region s as described above.
More specifically, s is bounded by and includes the following two sequences:
GGCGCCGGCCGGACTGTGCAG and ccagagactgcacCAGCGCAGCCCAGCTTGAGCAAGATAGCG, and is defined by SEQ ID NO: 2
herein below:
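
As an illustration only, the "bounded by and includes" definition above can be sketched in Python. This is a minimal sketch and not part of the disclosed method. The function names `join_wrapped` and `extract_bounded_region` are hypothetical, and the sketch assumes that the listing is available as text lines in which a trailing hyphen marks a line wrap and that upper and lower case denote the same bases.

```python
from typing import Iterable


def join_wrapped(lines: Iterable[str]) -> str:
    """Join wrapped listing lines into one string.

    Assumption: whitespace inside a line and a trailing '-' are layout
    only and carry no sequence information.
    """
    return "".join("".join(line.split()).rstrip("-") for line in lines)


def extract_bounded_region(sequence: str, left: str, right: str) -> str:
    """Return the stretch of `sequence` that starts at the first occurrence
    of `left` and ends with the first later occurrence of `right`, both
    boundaries included. Matching ignores case.
    """
    seq = sequence.upper()
    left_u, right_u = left.upper(), right.upper()

    start = seq.find(left_u)
    if start == -1:
        raise ValueError("left boundary not found")

    # Search for the right boundary only after the left boundary ends.
    end = seq.find(right_u, start + len(left_u))
    if end == -1:
        raise ValueError("right boundary not found after left boundary")

    # Slice the original string so its letter case is preserved.
    return sequence[start:end + len(right_u)]


# Toy data for illustration; these are not bases from the listing.
toy = join_wrapped(["ttAAGGCC- ", "ggttCCAAtt"])
region = extract_bounded_region(toy, "AAGG", "ccaa")
```

Applied to a clean copy of the chromosome 19 sequence with the two boundary sequences given above, such a function would be expected to return the region s. Exact matching would not tolerate character-recognition errors in a scanned listing.
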
GGC6CCGGCCGGACTQTGCA6C6G6GTCQACCCGCCTCCCTCATGAAT
ATTCAGCGAGAGGCCGGGTCGTGGACATCCTCGAGGGCTCGCTCCACCT 
rATTACGAGACCATTGGCTAACCTGCCCGTCAATCCGCTAGGGCAGAGCAATC 
GGGATACTGCGCGTGCGCACGGAAAAGCGAGGGCGGCTGACTCTCGGGT 
5 GAGGCGGTGCGGGAGGCGTCACTG AGGATCGTCGAGGGCC AATCAAAA 
GAAAACATGGAAGGGAAAGAGCCGAGAGACTCGATCTCATTCACTAGAA 
TTTGGTCCTCCTGCGCCTGCCAAGATrGTCTGA 
GTATTGATCGAACCCAGGAGTTCGAGATCAGCTTGAGCAAGATAGCG 

AGAACCCCCGCCCCTCCACCTCGTCTCAAAAAAAAAAAAAAATCGTCTCAGTAGCGAAT 

10 AGTCTAACGGAQAATGACAGGGAAATTGGTGATCCTTTCTGGGCCCAAGAGTTAGAA- 

ATGGCTTTGCAGGCCGGGCGCGGTGGCTCAAGCCTGTMTCCCAGCACTTTGG* 

GAGGCTGAGGCAGGTGGATCACCTGAGGTCGGGAGTTCAAGACCAGCCTGACCAA- 

CATGGAGAAAACCTGTCTCTACTAAAGATACAAAATTAGCCGGGCGTGCTGGCAAATG- 

CTTGTAATCCCAGCTACTCGGGAGGCTGAAGCAGGAGAATTGCTTGAACCTGGGAGG- 

1 5 CAGAGGTTGCAGTGAGCAGAGATGGCGCCGTCGCACTCTAGCCTGGGCAACAAAAG- 

CGAAACTCCATn*CAAATATTAATAATAATAACTAATAAAT/y^AACATAAATGCTAGCTT- 

TTGTT 1 G I 1 1 C 1 1 CAACAAATAGCTATGTGGCATCTACCATGTGTCTGATCCTGTGCT- 

GGCCCCTGGGAACAGAAAGGTGACCATGACAGCCTCAGCACCTGCCCTCAAAGAACA- 

GATTTTTTTCCTTGAGACAGGGTCTTTCTCTGTCGCGAAGGCTGGA 

20 CAGTCACAGCTCACTGCAGCCTCCACCTCTTGGGCTCAAGCGATCCTCCCACCTCAG- 

CTTCCAGAGTAGCTGGGACCACAGGTGTGCACCACCW^GCCCAGCTAAGTrrTATTTT- 

TTAAATTTTTTTAGAGACGAGGTGTCACCACGTTGCC^ 

GTTCAAGTGATCCTCTCCCCTCAGCCTTTCAAATTGTTGGGATTACAGGGGTGAGG'- 
CACCAGGCCTGGCCTOAAAGAACAGATATTAAATATACAAATGAATATATGATTACAGC- 

25 CTGGAGTGGTGGCTCGTGCCTGTGGTTCCAACACTTTGGAAGGCCAAGGCGAGTA- 
CATTGCTTGAGCTCAGGAGCTAGAGACCAGCCTGGGCAACATGGTGAAAACCCGTC- 
TCTACAAAAAATGCAAAAATTAGCTGGGCGTGGTGGCGTGCACCTGTAGTCCCAGA- 
TACTCAGGAGGCTGAGGTGGGAGAATCACCTGGGCCTGGGAGGCAGAGGTTGCAAT' 
GGGCAGTGATTGTGCCACTGCACTCCAGCCTGGGCAACAGGAGTGAAAACCTATCT- 

30 CAAATGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGCGCACGTGTATAATCACAAGTA- 
CAAAAGTGCTGTGAAGGAAAACTTCAAGTCACCATAAAGATTGATTATGGGCTGGGTG- 
CAGTGGCTCATGCCTGTAATCCCAGCACTTTGGGAGGCCAAGGCAGATGGAT- 
CACGAGGTCAGGAGTTCAAGACCAGCCTGGTCAACATGGTGAAACCCTATCTCTAC- 
TAAAAAAAAAAAAAAAAAAAAAAAAGCCAGGCATAGTGGCATGCATCTGTAATCCCATC- 

35 TACTCGGGAGGCTAAAGCAGGAGAATTGCTTGAACCCAGGAGGCAGAAGTGAGCCAA- 
GATCACGCCACTGCACTCCAGCCTGCGTGACAGAGCAAGACTCCGTCCCAGAAAAA- 
GAAAAAAAAAAAAGACTTATTATGACAGGATGTCTACTGTCAACTGTGGGGTGTTGAGT- 
GTTGGCCAAGTGATCAGAGAAGGCTTCGTGGAAGAAGCGAGGTTTGAGTAGAGCCA- 
GAAAATAATTAGAAGAGATCAACCAGCAAGAGGGGATGGATGAGaGAAGTGAGAAAG- 

40 GTGTTCCAGGGAGAGAGACCATCATACACAAAAGCTCTAGGCCAGAAGAAAGCT- 

gaggcctgtgagtgctgaaaggaagcctgtgggggtggagctctgagttgagca- 
CAGGGAGCAGAGAAAGGGCAGCTGGAGGGGAAGGCAGGGGCAGATCGAAATCTCTT-

ttttaaattaattaattcttaatttatttatttttga 

cagactggagtAcagtggcacaatctcagcgcaccgcaacctctgccacccaggct- 
caagcaattctctgqcctcagcctccctagtagctgggattacaggtgcgcaccac- 

S TACTGGCGAGCTAATTTTTATACTTITAGTAGAAACGGG 

ctggcctcaaactcctgacctcaaaagatccacccacttcagcctcccaaagtgctg- 

GGATTACAGGTGTGAGCCACCCTTCCCGGCTGTATTTTTGGAGACAGAGTCTTGCTCT- 
GTCCCAGCCTGGAGTATGGTGGTGTGMTTTGGCTCATTGCCACCTTGACCTCCAG- 
GGCTCAAGTGATCCTCCCACCTCAGCCTCCTGAGTAGCTGGGACTGCGGGTACACGA- 

1 0 CACCACGCCTGGTTAAI 1 1 1 II 1 1 AATTTTTTGTAGAGACGAGGGTATCTCACTATGTT* 
GTCCAGGCTGGTTGAACTCCTGAGCTCAAGCAATTCTCCCACCTCAGCCTCC- 
CAAAGTGGTGGGATTACAGACGTGAGCCACTGTGCGCGGCTTAATTTATTTACATAA- 
ATTTTTTTATGTTTACTTTTCTATCTCCT ACAGGAAG AAAATATATTTTGTTATTGACAG- 
GGTCTCGCTATGTTGCCCAGGCTGQTATTGGGCTCAAGCCATCCTGTTCCCTCAGCC- 

1 5 TCCCAAAGTACTGGGATTACAAGCGTGAGCCTCTGCATCCAGCCCAGATCCAAAATCT- 
TTACTGTCACCTACAGAGTCCTCTGTAACTAGCTTACTGCTCATCATCCCCATACCAAC- 
CCACCTTACTGCTCTGATCTCCTCCTCTCTCTCCCCCAGCTCATTTTGTTTCAGCTATG- 
CTGGTCTCCTTGCTGTCTCTAAAACATAACAAGCACATCCCATCTCAGGGCCTTTG- 
CACCAGCTATTTTGTCTGCCTGGAATGCTGTTTCCCCTGATAGCCATGTGGCTGACA- 

20 CACTCACCTCCCTCAGCTCTTTGCTCAATTGTCAACTTCTCGGCCCGGCATGGTGGCT- 
CACACCTGTAATCCTACCACTTTGGGAGGCTGAGGTGGGCAGATCACCTGAGATCAG- 
GAGTTCGAGACCAGCCTGGCCAAGATGGTGAAATCCCGTCTCTACTAAAAATACAAAA- 
ATTGGCAAAGCATGGTAGCACATACCAGTAATCCTAGCTACCCGGGAGGCTGAGG- 
CAGGAGAATTGCTGGAACCCGGGAGGCAGAGGCTGCAGTGAGCCAAGATCATGC- 

25 CACTGTACTCCAGCCTGGGTGACAAAGCAAGACTCTGTCTCAAAAAAAAAAAAGTCTC- 
CTTCTCAATGAGGGCTTCCTGACCACCAAATTAAATCTACCTCCTAGACACACACACA- 
CACQCACGCACGCACGCACACACACACACGCACGCACGCACACAGACACACAGACA- 
CACACTATATCCCCTTTCCCTGCTTTATTGTTCTTGAGAG 
CATGCTGAATATTTTACnTATTTATTTTGTTTAGAAAGCTCCTGGCTC 

30 TCACGCCTGTAATCCCAGCACTTTGGGAGGCTGGAACAGGTGGATCATGTGAGGT- 
CAGGAGTTCCAGACCAGCCTGACCAACACGGTGAAACCTCATCTCTATTAAAAATG- 
CAAAAATTAGCTGGGTGTGGTGTCGCATGCCTGTAATCCCAACTACTCAGAAGGCT* 
GAAGCAGGAGAATCGCTTGAACCTGGGAGGCAGAGGTTAACGCTGAGCCGAGATCG- 
CGCCATTGCACTCCAGCCTGGGCAACAAGAGTGAAACTCTGTCTCGAAAAAAA- 

35 CAAAAGTCAGCTCCATGGCAGGAGTGATGGCTCACGCCTATAATCCCAGCACTTTGT- 
GAGGCCGAGGCGGGCGGATCACTTGAGGTCAGGAGTTGGAGACCAGCCTGGCCAA- 
CATGGTGAAACCTCATCTCTACTAAAAATACAAAAATTAGCCGGGCGTGGTGACACAT- 
GTCTGTAGTCCCAGCTACTTGGGAGGCTGAGGCTGGAGAATGGCTTGAACCTGG- 
GAGGTAGAGGTTGCAGTAAGCCAAGATCGCGCCATTGCTCTCCATCCTGGGCAACA- 




gactccgtctcagaaaqgaagaaagaaggaaagagagaaagaga6aaagagacaga- 
gagagagagagaaagggagaaagagagaaaggatggaaggaccctgacaagcact- 
gttgcataaaagtttc7tttctctctc1 1 li i m 1 1 1 ii n 1 1 1 1 ii 1 1 gagacagggtc- 
tcacttctgttgctccagctgaagtgcagtggtgagaacatggctcagtgcagcct- 
5 caacttcccaggcttaagtgatcctgccacgtcagcctcctgagtagctgggactg- 
taggtgtgcaccaccgtgcctagctaatttt^ 

cgacgttgcccaggctggtcttgaactcctgggcttaagggatctgcccgccatggc- 
ctcccaaagtgctgggattaccagcgtgagccactgtacccagcctgagtataggtt- 
tctgataaattttaggatcatattgtttggactgc;gtaagaatttccagaact 
10 gaagaaactgactggtttatattttattt^ 
tcactcttgttgcccaagctggattgca^ 

cgcctcgcggtttcaagtgattctcctgcctcagcctccccaggagctgggatta- 
caggcacccaccaccatgctcggct ai rtttttttl i atttttttatttttagtaga- 
©acggggtttcaccatgttggccaggctggtctcgaactcctgacctcaggtgatc- 
1 5 cacctgccttggcctcccaaagcgctgggattacaggcatgagccactgtgcaaggc- 
ctaggctggtttataaaattgctaaaccaagcagaacatgaattaaataccaaggaa- 
atactctcctagattgtcatgttacatcagccaatactaaaattgtcaagatacacaat- 
ttgaatgaactccatggtccaagtcgaattatctatgatattaccgatctaataaacag- 
cactatgtcccttaatgggagaaaaagttggagaatttaagagaatatcaatccaat- 

20 GTTGGTTGGGTGCAGTGAATCATGTCTATATTCCCAGCACTTTGGGAGGCCAAGG- 

CAGGAGGATCACTTGAGCCCAGGAATTCAAGGCCAGCCTCGGCAACACGGTGAGATO 
CTGTCTCTACGGAAAATTAAAAAAAAAAAAAGAGAGAGATTAGTGGGATGTGGTGCC- 
TATAGTCCCAGCTACTTGGGAGGCTGAGGCGGGAGGATCATTTAAGCCTGGGACGTT- 
GAGGTTGCAGTGAACCATGAGTGAGACTCATCTCAAAAAAAAAAAAAAAATGGCGAT- 

25 CACTAGAGGAAAAAAAAAQTAAAGTGGGGTrTGCGGGTAGTGGGAGGGCCCTTCCTG- 
CTAGGTTGCACTATGATCTCCAGGGAGGCTCCACGGGAGAATCATTTCCTTGTCTTTT- 
TCAGTTrCTAGAGCCAAATTCTTTGCATACCTTGCATTCC 

CTAACCTTCAAAGCTGGCAGCTAGCCTCTGGCTCAAGTGTCACATGGCCTGTCTCT- 
GTCTTCCTATCCAATCTTCCTCTTATAAGAACATTGGAGCCAGGCATGGTGGCTGACG- 

30 CCTGTAATCCCAGCACTTTGGGAGACCGAGGCAGGCGGATCACAAGGTCAGGAGT- 
TCGAGACCAGCCTGGCCAACACAGTGAAACCCCGTCTCTAGTAAAAAAATA- 
CAAAAAAGTAGCCGGGCATGGTGGCAGGTGCCTGTAATCCCAGCTAC7TGAGAGGCT- 
GAGGCAGGAGAATCGCTTGAACCTGGGAGGCAGAGCTTGCAGTGAGCCGAGATAGT. 
GCCAATGCAGTCCGGCCTGGGCGAAACAGCGAGACTCCGTCGCAAAAAAAAAAAA- 

35 . ATAATAATAAATAATAAATAAAAATAAAAATAAAATAAAAAAATAAAAATAATAAAATAA- 
ATAAAAATTATTTTGAGACAAAGTCTATTCTGTGGCAGAGGCTGGAATGCAGTGGCGT- 
GATCACAGCTTACTGCAGCTTCTACCTCCTGAGCTCAAGCGATCCTTCPACCTTGGCT- 
TCCTGAGTAGCTGGGACCTCAGGTGTACATTACCACGCTCAGCTAATTATTTATTTATT- 
TATTATATTTTTGTGACGGAGTTTCGCTCTTGTTGCCCGGGCTGGAGTGCAATGGTG^ 
tatctcagctcactgcaacctctgcct
tcctgagtagctgggattacaggtacatgccatcacgcccagcta^ 
tagtagagacggggtttcatcatattggtcaggctggtctcgaactcctgacctcag^ 
gtgatccacctgccttggcctcccaaagtgctgggattacaggcgtgaggcaccacg- 
CCCGGCAAl rTTTTTTTTCTTTTTTTn n icagacagagtcttgctctgtcacccaggo 

TGGAGTGCAGTAGCGTGATCTCGGTTTACTGCAACCTCCATCTCCCGGGTTGAAG- 
CGATTCTCCTTTCTCAGCCTCCCAAGTAGCTGGGACTACAGGTGCACACCACCACGG- 
CGGGCTAAI \ I \ IGTATTTTTAGTAGACACCAGGTTTCACCATATTGGTCAGACTGGTC- 
TCAAACTCCTGACCTCAGGTGATCCATOTGCCTCAGCCTCCCAAATTGCTGGGATTA- 
CAAGCGTGAGCCACAC ACCTGGCTTAA I II I I ITA7TTTTGATCGACACAGGGTCTCCC- 
TATGTTGTCCAAGCTGGCAGAGATTTTTGTTTGTTTC 

GTAGCCCAGGCTGGAGTACAATGGTGCAATCTTGGCTQACCACAACTTCCGCCTCC- 

CGGGTTTAACAGATTCTCCTGCCTCAGCCTCCCAAGTAGCTGGAACTACAGGCACC- 

TACCACCACACCAGGCTAATTTTTGTGCTTTTTAGTAGAGATGAGGTTTCACCATGTT- 

GGCCAGGCTGGTCTTAAACTCCTGGCCTCCAGTGATCCACCCGCCTTGACCTCC- 

CAAAGTGCTGAAATTACAGGCGTGAGCACCGCGCCTGGCCTCTCAACCTACAATTT- 

CAACACCCAAGGAAACAGCCCACCATGAGTGAGAACCAGCAGACACAACAAACTA- 

TAGGATTAGCTGCGTCCAAACTTCAGGTGATAGATTATCAGGCATGTACTTGAAAC- 

TAAAGGACACAAAAGAAGAATCqGAAATATAAAATAAAGGATTGGACTTGTGTGAAAA- 

GAATCCCTTAGAAAGGGCTACTTTCAGGCTGGCCATGGTGGCTAATGGCCTGTAATC- 

CCAGCACTTTGGAAGGCCGAGGTGTGTGGATCACCTGAGGTCAAGAGTTCAAGAC- 

CAGCCTGGCCAACATGGTGAAACCCCGTCTCTACTGAAAATACAAAAATTAGCCAGGT. 

GGGGTGGCAGATGCCTGTAATCCCAGCTACTCGGGAGGCTGAGGCAGGAGAATCGC- 

TrGAACTCAGGAGGCAGAGGTTGCAGTGAGCTGAGATTGCGCTATCGTGCCCCAGCC- 

TGGGCACTAGAGTGAGATCAAAAAAAAAAAAAAAAAAAGAAGAAGAAGAAGAAAGGGC- 

TACTTTCAGACTGCCTTGCCAAAAATCATAACCACAATGATGAGCATGTATTGAGT- 

CAAAACAGAATCAAAAGAGAAGAAAGTCAATTTCTGTGCAAACTACTTTTATTT^ 

GAAAGTTTCTCTATTTTGTTTATAAACATTAAACCAGTGCTGTGTGAAGGC 

GGGGAGAGGTGGGGCAGGGATCCTGGTAGAGACCAATGTTTCCCACCCAGACCC- 

CAAGACTGCTGGGAGAGATGGTGTCAGCAGTGACTCCCAGGAATATCCAGTGGTGTG- 

GTGGCCCATCCCAGGCCCGGCTGGGCAGGTGGCTGGCTTGCTGGGGGATGTGAT- 

GATGGTGGTAGGCATGGGAGGCAC7TTGGACGGGATCTGATTTGGCAAAAGGAAGTG' 

GTTTCCTGTCCCCAGTGATTTCCAGPCCTTCCCAGACCTCCCAAGGCTAAGGCAGAT- 

TACTAAATTTAAGGCTGGGGCCCTCCTTCTTCCCTGGACTTCCAGGAGAACAGAGAAC- 

CGGTGGCAAGGACCACCACCAGCAGGGTGAGGGGTGCAGATAAAGGCAGCAAAAAA- 

CAGAGGGAGAGGTCTGGAGGGAAGGCAGGAATGCTTGTTTCTGTCAGCCTCAGAAAC- 

CTCCTTCTATCCTGCTAGACTTTACTCCTTTGAGGCTTCACCCTGGGGAACAGCTGGG* 

GAGAGACAGGATCTTCAGACATCAGGAGCTCCCACCTCCTCATCCCACATGCAAATC- 

CGCTGCCTGTCTCTATCCTCCCACCCCTTCCTAAGGGGACCTCTCAGCACCTCC- 
CAAACTGCTCCAGAATCCAAGTTCTGTGTCACCTCCAAGAACCAGATGGAACCTTCCA-

ATCAGAGCCTCCACTGATGAAATGGAATATTTCCAGTGTCTCCTAACTGCCATAAGGA- 

GAAGCCCACCTCTCTCTAACACCTTGGTTGTCTT7TTGGGTCCCACCTCCATATT- 

TAAAAAATCTCCTCTCTCAGGGCCGGGAGCAGTGGGTCACACCTATAATCCCAGCAGT- 

TTGGGAGGCCGAGGTGGGTGGATGACCTGAGCTCAGGAGTTCAAGACAAGCCTGGT- 

CAACATGACGAGACCCTGTCTCTACTAAAAACACAAAAAATTAGCTGGGCGTGGTGGT- 

GCATG.CpCGTAATCCCAGCTACTTGGGAGGCTGAGGCAGGAGAATCACTTGAATCCG- 

GGAGGTGGAGGCTGCAGTGAGCCAAGATCGCGCCACTGCACTCCAGCCTGGG- 

CGACGCAGCTGAAGCTGTGTCTCCAAAAACAAAACACACACACACACACACACA- 

GAAAAAAAAAACCAAAATAAAAAAATCTCCCTTCTCAGGAATGTAACGGAAT(jrTCCTT- 

GCCTTCTCCCCTAAGCCTAATAGAGAATTTTCCTCAGTTACACTGTMTTTTATTAATG* 

GAl l 1 1 ICCTCATTCTGCCCAATGCAGTGTAATGAAAGCTTCCTCTCCATCTGTTATAT- 

TATATATAAATATATATTATATATTTATATATTATATATTTATATATAACATATA^ 

TATTGTCACCCAGGCTGGAGTGCAGTGGCACCATCAGGGCTCACTGCAGGATCAATC- 

TCCCAGGCTTAAGCGATTCTCCTGTGTCAGCCTCCTGATGAGCTGGGATTACAGG- 

CACCCGCCACCACACCCGGCTAACI rTTTTTTm GTATTTTTAGTAGAGATGGAGTTT- 

CACCATGTTGGCCAGGCTGGTCTAGAACTCCTGACCTCAGGAGATCCGCCCGCCTT- 

GGCCTCCCAAAGTGCTGGGATTACAGGTGTGAGCCACCTGGCCGGGCCCTCCACT- 

TCCTTCTTGTACATTGCTGAATCCCTGTGTCAGCCCTAGAGGTCCAGTCTTTTGCCCTC- 

TCCCAGCCTTAATCTACAATTCTGTMCCCACCCACCATCATTAAAATGAGATTCTTC 

TTGTCGCTTCCCTTGGCTAAAATGGATTATTC 

GATGATAATAAAAACATTGGATTGAGCAGAAACCAATCAAATAACTAGTAAGGCAGTAC- 

TGGCGAGCACCCTACATCCTGACAGCTTTATAAAGGGCGCTTCCAGCCAGGTGCGGT- 

GGCACATGCCTGTAATCCCAGGACTTTGGGAGGCTGAGGCGGGCAGGTCACCTGAG- 

GTCAGGAGTTCAAGACCAGCCTGGCCAACGTGATGAAACCCTGTCTACACAAAATA- 

CAAAAAAAAAAAAAAAATTAGCCGTGCGTGGTGGCATGCGCCTGTCATCCCAGCTAC- 

TCTGGAGGCCAAGGAGGGAGGATCACTTGAGCCCGGGAGGCAGAGGTTGCAGTGAG- 

CCCACATCTTATCACTGCACTCCAGTCTGGGTGACAAAGCAAGACTCCATCTCAA- 

ATAAATAAATACAAATTGGCCGGGTGCGGTGGCTCATGCCTGTAATCCCAGCACTTTG. 

GGAGACCAAGGCAGGTGGATCATTTGAGGTCAGTAGATCAAAACCAGCCTGGCCAA- 

CATGGTGAAACCCCGTCTCTACTAAAAATACAAAAAGTAGCCGGGCGTGGTGGTGGT- 

GGGCGCCTGTAATCCCAGGCAGGAGAACTGGTTGAGCCCGGGTGGGGGGGGCC- 

CGAGGTTGCAGTGAGCACAGATGGCGCCATTGCACTCCAGCCTGGGCGACAGAG- 

CGAGACTCCGTTTCAGAAATAAATAAATAAAATAAAAATAAAAATAAAAAAATAATAGAA- 

ATTTAAAAATAAAATAAAGGGCTTTTCCTCACCfACTCCACTAACTATAAGSGACCCT- 

TACCCCCGACATTACTATTAAATAT^ 

ATGAGCi i \ \ CAGAGCTCCCTCTCCCAATATAACGGTTTGTTC 

TTCCTGTGGGATCCCCCTTTTCCCCAACCCCCAACTGTCGGGAGGTCCCCATGACTTC- 
TCCCCTGGGCTCACCCCGAAGTAGTTCCGCGGCACGTAGCCCTCCTGGCCGTGCAG- 
CGCGGCCCACCACCAGTCGOTCTCCTCCGGCCCGTCCCTCCGCAGCACGGTGAC-

CGACTCGCCCTCGCGGAAGGACAGCTCGTCCCCGAACTCGGCGCTGTAGTCCCAGA- 

GAGCGTACACTGCCCCGCTGTTCATCAGCCCCATACTCTGCTCGACGTCTGAAACAT- 

GCCACGGAGGGGAAGGTGAGAGCCTGGCCCAGGGGGTCCAGGAACAGGGGC- 

CACGTGGGGTCCAGGACAGACCCTGGAATTTGGCGCCTGTCCCAGCAACCACCTGAA- 

ATGTTGTGTGTGCCCATGGCTGTGGATGGGAACCGGAGCTGGAGTCAGATGCCGG- 

GACTGGCCGTCTTTGAGCGTTCGAQGAAACTGGGGGAGGCATGCCAGTGGGCCACC- 

CACTCCCGAGGCAGGQTCAGAGGCTCCC ATT T C 1 U IG I MUM 1 1 H 1 1 1 1 It I U GA- 

GACAGAGTCTCGCTCTGTCGCCCAGGCTGGAGTGCAGTGGCACGATCTCGGCTCAC- 

TGCAACCTCCGCCTCCCGGGTTCACACCATTCTCCTGCCTCAGCCTCCCGAGTAGCT- 

GGGACTACAGGCGCCCGCCACCACGCCTGGCTAATTTTTGGTATTTTTAGTAGAGT- 

CAGGGTTTGACCGTGTTAGCCAGGATGGTCTCGATCTCCTGACCTTGTGATCCGCC- 

CACATTGGCCTCCCAAAGTGCTGGGATTACAGGCGTGAGCCACCGCGCCCGGCC7TT- 

I I J I J NTI I I M 1 1 1 1 1 1 I GAGATGGAATTTCGCTCTTGTCGCCCAGGGAGGAGTGCA- 

ATGGTGCGGTCTCACTGCAACCTCCGCCTCCGGAGTTCGAGCCATTCTCCTGCCT- 

CAGCCTTCCAAGTAGCTGGGATTACAGGTGTGCGCCACCATGCCTGGCCAATTTTTG- 

TATCTTTAGTAGAGACGGGGTTTCACGATGTTGGTCAGGCTGGTATCAAACTCCTGAC- 

CTCAAGTGATCCACCCGCCTCGGCCTCCXDAAAGTGCTGGGATrACAGGCGTGAGC- 

CACCTGGCCCGGCCCTC^TTTCCTTCTTGTACATTGCTGAATGCCCGTGTCAACCCTA- 

GAGGTCCAGTCTTTTGCCCTACCCTGGCGCTTAGCTTAAGTGGTACAGTGTCTAAG- 

GAAGATTCGCACCTTCCTTGAATGATAGGGTCCTTTAAGTTGGCTCATCTGCCTCTTTC- 

TTTTCnTTTClTTTCTTTTCTTTTTGGAGAGGGAG 

GAGTGCAGTGGCGCGATTTCGGCTCACTGCAACCTCCGCXrrCCTGGGTTCCAGCAAT- 

TCTCCTGCCTCAGCCTCCAAAGTAGCTGGGACTACAGGCCCACGCCGCTACACCCGG- 

CTAAATTGTTTTATATTTTrAATAGAGACGGGGTTTCACCGTGTTGCCCAG 

GGAAATCCTGAGCTCATGCAATCCGGCCGCCTCQAGCCTCCCAAAGTGCTAGGATTAr 

CAGGCATGAGCCACCGCGCCTGGCTTTCTTTTTCTTTrCTTT^ 

CAAGGTCTCACTCTGCCACCCAGGCTGCGGGAGTQCAGTGGTGAGATCAAGCTTACT. 
GCAGCCTCGAACTTCCAGATTCAAGCAATCCTCCTGCCTCAGCCTGCTCCTGATTCTT- 
TATGTTATTATTAAATATTTTGTAGGCCGGGCACAG^ 

CACTTTGGGAGGCCAAGGCAGGCGGATCCTCTGAGGTCAGGGGTTTGAGACCAGCC- 

TGGCCAACATGGCAAAACCCCGTCTCTACTAAAAATACAAAAAAAAAAAAAAAAAAAGT* 

TAGCGGGCCGTGGGGCCCTTGCCTGTAATCCCAG7TACTCGGGAGCCTGAGGCAG- 

GAGAATCGCTTTCACCGAGGAGGCAGAGGTTGTAGTGGGCTATGGTGCGATTGCAC- 

TCCAGCCTGGGTGACAGaGCAAGAGTCTGTCTGAAAAAATAAATAAATAAAAATAA- 

ataaatatttcgtagaggtcaggtgtggtggctcacacctgaatcttagcactttgg* 
gaggccaaggtgggcagattgcctgagctcaagagttcgggaccagcctgggcaa- 
cactgcaaaaccccttctgtactaaaaatacaaaaaaatgagtcgggcatggtggt- 
gagcacctgtagtcccagctagtcaagaggctgaggcagagaattgcttgaatccag- 
GAGGTGGAGGTTGCAGTOaOCCGAGATTGAGCCACTQCACTCCAGCCTGGGTGA'
CAGTGAGACTCTGTCTCAAAAATMT^^ 

TACAATGTCTTGTAGCCTGACCAGGCTCACCTTTCAAATATATAACCCTCTGTCTCACC- 

CATAAGTCCTAGGACCTGCCTCACTCGAACTCTCCGTGAAGTTCCTTGCCCACACCGA- 

GATACAACTGGCTCCTCCAGGTGTGAAATGACCCTGTGCACAATCCCCGTG6CACAG- 

CCTACTTCGCCCTGCCCGTCGGGGAACCAGGTGATGTAGCCTGCCCCC3TGGAGAGA* 

TAGGGTACAGCCTTGTGTCTTCCTACAAGCCCCTTTCTrGGCAGCTGTAGCCTGCTCAC- 

CTGCCAGTGGTGTGGCAATGCCTCTCCCACAAGTGGCAGAGCCCACCTGCCCAGAG- 

CCCTATGCCAGGTAGATGGCAGGGTTGAAACGTTCAGCTCCTCACCCTTGAAGATGT- 

GAAAGGTGAGCAGACCAATCTTCACAGCCACTCTCCTCCCCAAAGGTGTCCAGCTCG- 

CATAGCACAGCCTCC^TGTCCCCTTTTCCCTTAGGAGGGCATAGTCCCCCCACCCC- 

CGCAAGCGGTCCATCCCTCATCCTCCTCCTCGGCAATCCTGCCAAGTGGTTGGTA- 

CAGCCCCCATACCCTTCTCTCCCTAGTAGGGGGTAGTTGCTCCCCTCCCCGCTCCTG- 

CGCACCCGCCAGGTACCCAGGCGCCAGCAGCCCTGCCTCGCACCTGCCAGGTAGGT- 

GGCGCAGTCAGCATAACCCTCGCGGTAAGGGTCGCACTTCTCGAAGGCGGTGGCGC- 

CGTCGCTGAGCGTGGTGGGGAAGATTGCAGCGCCGTGCTGCACCAGCGCCATGCA- 

GATGACTGTGTCGTTGCACGACGCCGCGCAGTGCAAGGGTGTCCTAGGCGTGGGG- 

GTGGGGGGTTGCGGGGAACGATGCGTGAGAGGCTGCGCGTCCGCCCACGGGGGAC- 

CCAGCCCACCGCGCGGGTCGGGGCTCACCAGCXGTGGCTGTCGGGGGAGTTGA- 

CATTGGCACCCGCGGTGATGAGGAAATCCACGATAGAGTAGTTGGCGCCGCAGAT- 

GGCGTTGTGCAAGGCAGTGATGCCCTCCTCGTTGGGCTGGCTCGGGTCGTTCATCT- 

GAGTGCACCGGGGGAGGGGGAAGACTCAGTCCCGCGGCTGGCATCTGCGATGCCC- 

CCGCCGTGCCCACCTCCCGCTCAGCAGCGCTCACCTCCTTCACCGGCTGCTGCAC- 

CACCTCCAGCTCCCCGGTCAGCGCCGCGTCCAGGAGGAGCACCAGAGGGTTGAGG- 

CGCGCGCGGGGGGCC7TGCGCGGGGAGCCCGCCTTCCGCAGCACAGAGCGCATC. 

TCCTGGGGGACAGGGCGCAGAGGTCAGCGACTTGGAGGGATTGTTAGTATATCCAT- 

GATCTAGAGTAGGAAACAGAGGTCCAGGGACTTGTGGCACCCATCTAGACAGGGGTA- 

GAACTGGGATTCCCTCGGGATGGGGTGAGGGGGTGCCTTCGATCTCCTCCTAGAGCC- 

TCCAGTTCCCTGCC^TAGACAGGGAATCCTGTGATTTGAGAATCTTGGGCGCTGAAA^ 

TTGGGAGAAAGCTGGGGGGCCATGGGATTGGTGGCAAAGTAATTCrrATCAGTT- 

CAAAACAATGATTGTGGAAGCCAGTTATGCAATTCACACACAGTCTCACATTTCTTTT- 

GTTAATAATGAATGCAATGAGACACACATGACAAAATGTTACCAGGAGTGTTCATTC- 
CGGATGm 

i 1 1 1 1 1 i M i 1 1 1 1 GAGATGGAGTCTCGCTCTGTC ACCCAGGCTGGAGTGCAGTG- 

CAGTGGTGTGATCTCAGCTCACTGCACCCTCCATCC6CCAGGTTCAAGCAATTCTCCT- 

GCCTCAGCCTCCTGAGTAGCTAGGATTACAGGCATGCGCCACTATGCCTGGCTAATTT- 

TCATATI 1 1 1 AGTAGAGACAGGGTTTTGTCATGTTGTCCAGGCTGGTCTCGAACTCCT* 

GACCTCAGGTGATCCACCCACCTCAGCCTCCCAAAGTGCTAGGATTACAGGTGTGAG- 

CCACTGTGGCCAGCCTCATGGGCTTTCTTATTTTTAATTTTCCTCCTGTAAG 
TATTCTGQ6CTGGGCGAGGTGGCTCATGTCTGTAATCCTAGCACTTTGGGAGGCT-
GAGGTGGGAGGATCACTTGAGCCCAGGAGTTCGAGAACAGCTTGGGCAATATAGTGA- 
GACCCAGTCTCTACAAAAAATAAAAAATTAGCCTGACATGGTGGCGCACACC- 
CGTCGTCCCAGCTACTTGGGAGGCTGAGGCAGGAGGATTACTTGAATGGAAGAGAAG- 
5 GAGGCTTCAGTGAGOCATGATCATGCCACTGCACTCTAGCCTGGGCAACAGAGTGA- 
* GACCCAGTCTCAAAAGAAAAAAAAATGCATTTATTTATTCCAAGTGTGTGAGTGCATAG- 
CATTTGTGATTCTGGTCTTTGCTGTTrc 

CAGAGATCCCAACAGCCACTGAATTCAAAATTCCCAGATGCTCAGTTATTTCAAGTTTC- 
CAATATGTTGTGATTGCAGAAATGCTAGGCTGTGCTATTrCAAATTGCTGAGGGGG- 

10 CAGGACTTTGGAATCCAAAGATTCTATG^TC 

TCI lirn I IGTTGGTTfl 1 1 IGAGACAGAGTCTCGCTCTGTCGCCCAGGCTGGAGTG- 
CAGTGGTGCGATCTCAGCTCACTGCAAGCTCCGCCTCCCGGGTTCAGGCCATTCTCC- 
TGCCTCAGCCTGCCAAGTAGCTGGGACTACGGGCGCCCGCCACCACGCCTGGCTATT- 
TTGTATTTTTAGTAAAGATGGGGTTrCACCGTGTTAGCCAGGAAGGtCTTGTTCT 

15 . GACCTCGTGATCCGCCCACCTCGGCCTCCCAAAGTGCTGGGATTACAGGTGTGAGC- 
CATCATGCCTGACCTAGAATTTCATnTAAAAGACTAGAAGGAAATGGCTGGGTGCG- 
GTGiSCTCATGTGTGTAATCTCAGCACTTTGGGAGGCTGAGGAGAGTGGATCACCT- 
GAGGTCAGGCAGGAGTTCAAGACCAGCCTGGCCAACGTGGTGAAACCCTGTCTCTAC- 
TAAAAATACAAAAATTAGGTGGCCGTGGTGGTGCACGCCTGTAATCCCAGCTACTCAG- 

20 GAGGCCGTGGCATGAGAATCACTTGAACCCAGGAGGGACAGTTATAGTGAGCTGA- 
GATGGCACCATCGCACTCCAGCCTGGQTGACAGAGTGAGACTCCATCTCAAAAAAG- 
GAAAAAAAAAAGAAAGACTAGAAGGAAATATTCAAAATGTTAATGATGGTTCCCTGT- 
QAGTGGTGTGATTrTGTCCTCTTTCTrCT A l 1 I 1 J ATTTATTTTCCCCAAGCTCTCTATG- 
GTGTTGGTGTATTTCTCTATAGTGGAATGTGTAAATTTAAAGTATAAATCTCAGCTGGG- 

25 CACAGTGGCTCATGCCTGGTTTGAGACGAGCCTGGACAACATAATGAGAACTGTCTC- 
TACTGAAAATGTTAAATATTATCTGGGAGTGGTGGTGCATGCCTGTAGTCCCAGC- 
CATAGGGGAGGCTGAGGCATGAGGATCAATTGAGCCCAGTAGGTGGAGGCTGCAGT. 
GAGCCATGATCTTGCCaCTGCACTCCAGCCTGGGCAACAGAGTGAGACTCTGTC- 

tcgataataataaccctctattacaacatatcagtgcatgaatttgtgattttataatt- 

30 caamtatgagcatctttaattgtcagattre 

cagtctatgatactaactttataattaj 1 1 1 i 1 1 i aagag aagagtttccttttattt- 
tattttatttgagacagagtttctctctgttgcccaggctggagtgcagtg 
atctcggctcactgcagcctctgtctcctaggttcaagcaattctcctgcctgagco 
tcccgagtagctgggattacaggcatgcaccaccaggcccagctaatttttgtatttt- 

35 tagcagagacggggtttcaccatgttggcgaggctagtcttgaactcctgacct- 

caagtgatccacccgcctcggcctcccaaggtgctgggattacaggcatgagccac- 
cgtgcccagcctaactttataattctaa(3atcgtgttcaaacctttaaatgctctaggg- 
ctctaaaatgttactatcctaagacggtgacactagcgtttgattcttacattctatgat- 
tttttaagtttctctgtggccaggactctgtgattctacaatg 
caacatgttgttattcatcgcctcttgatttcaaaa

ctttactttcaggagggcctaggaataggcattttgggggggtccacctgacccctg- 
cttctctgagaagtgatctcttcccgctgtct acgcacacgg agtgttcaggactgt- 
tccatgtggctacaaccctcttcccagtcaagatgcagggaccaagatcagcagga- 
5 gaccatcccctggtccaatggtgacaacagtaagagcagttaacagttatgtgccagg- 

TATTATGCTAAGCACTACATTAATGTAT7TAATCTTGGCGGGGTGTGGTGGCTCACACC- 
TGTAATCCCAGCACTTTGGGAGGCCAGGGGGGGCAGATCACTTGAGGTCAGGAGTT- 
CAAGACCAGCCTAGCCAACACAGTGAAACCCCATGTCTACTAAAAATACAAAAATTAG- 
CCAAGCGTGGTGGCATATGCCTGTAATCCCAGCCACTTGGGAGACTGACGCAGGAGA- 
1 0 ATCACTTTAACCCAGGAGGTGGAGTCCAGCACCCAGCCGAGACT 

T ATTTATTTATTTA I i I f T Ai ITTTATTTTTn 1 G AGACGGAATCTTGCTCTGTCACC- 

CAGGCTGGAGTGCAGTGGCQCGATCTCAGCTCACCACAAGCTCCGCCTCCCGQGCT- 

CACGCCATTCTCCTCTCAGCCTCCAGAGTAGCTG 

CCCCAGCTAATTTTTGT A I \ \ \ \ AGTAGAGACGGGGTTTCACCGTQrTTAGCCAGGATG- 
15 GTCTTATCTCCTGACTTCGTGATCCGCCCGCCTCGGCCTCCCAAAATGCTGGGATTA- 
CAGGCATGAACCACCACGCCCGGCCTATTTATTTATTTATTTAGAGATGGAGTCTTG^ 
TCTGTCGCCCAGGCTGGAGTGGAGTGGTGCAGTCTTGGCTCACTGCAACCTCCGCCT- 
TCCGGGTTTAAGCGATTCTCTTGCCTCAGCCTCCTGAGTAGCTGGGATTGGAATGA- 
GACCACCACTrCTCCTGTTGTCCTTCCCAGCTTCTCCCCCACCTCCCCTTT^ 
20 TTATAAGAC AGG AAAAAAAGG G AGAAAGCAAAACGCTG GAAAAAAACAGAAGTACGA- 
TAAATAGCTAGATGACCTTGGCGCCACCATCTGGTCCTGGTGGTTAAAATAATAATA- 
ATAATATTAATCCCTGACCAAAACTACTGGTGTTATGTGTAAATTCCAGACATTGTAT- 
GAGAAAGCACTGTAAAACGTTTTGTTCTGTTAGCTGATGTCTGTAGCCCCCAGT- 
CACGTTCCTCACGCTTACnTGATCTATCGTGGCCCTTTOACGTGGACCCCTTAGCGT^ 
25 GTAAGCCCTTAAAAGTGCTAGGAAi ilClll! ! CGGGGAGCTCGGCTCTTAAGACGCT- 
GATGCTCCCGGCCGAATAAAAACCTCTTCCTTCTTTMTCCGGTGTCTC 
GTCTGTGGCTCGTCCTGCTACAGAATTACAGGCACGCGCCACCGCTCCGGGCTAATT- 

tttgtatttttttagtagac^^ 

tctgacctcatgatccacccacctcggcctc.ccaaagtgctgggattagaggcgt- 
30 gagccaccgcgcccggccgagactcactattttataagaggagagagcaaagccag- 
gaacagtggctcatgcctctaactgcagcaatttgggaggctgaggcaggtggat- 
catttgaagtcaggagtttgagaccagcctggccagcatggtgaaacctcatctctac- 
.taaaaatacaaaaattagccaggagtggtggcatacacttataatcccagctacttgg- 
gaagctaaagcgggaggatggcttgaacctgggaggcggaggttgcagtgagc- 
35 cgaggtcaagccactgcactccagcctgagtgatggagcaAgactctgcctg- 

gaaaaaaaaaaaaaatagaggagagagcagagcagacacaagagacacagagacaga- 
gagggagagaagagagggtgactgctttgattcaggcaagacttctcagtcccagar 
atgaacccactg7tgtgccaagactcagtcatgtccaggtgtatgactcgagattgct- 
gaaggaatgcccggggcagggcacaggcacaggttattggagagaaggagcaga- 
GAACATCTCTAT0T6GCCAAGACTCCCAGATGGCCCTCCATATAGTCACACACAGC-
TATCCTAAAGACTACATTTCCCAGCATCCCATTGCAATGAGGCTCCTGGCCAGTGG- 
GAGCAGGCAGAGTGATGTATGGAACTCCCAGGTTCTGCCTGAAACAGGAAAGGGCAC- 
TTTCTCTTCTTCTTTCTCTCTTCCrrGGCTGGAG 

GACCATGAAGGCAGGCTTACTCCCCGATGGATGGCAGAGCCCCAGGTAGATAGAGCC- 

TGGGTCCTGACTCCAGTGAGGTGCCTACAGTCCTGGGCTGCAAACTCTTGGACTTC- 

TACTCAAAAGAGGAGAAAACTTCGATCTCATCTAAGCCACTATATTTGGGGGGCTCT- 

TTGCTACAGCTCCTGGATTCATGTAGCAAACATACCCCGGTTTCCTCCTGTATTACT- 

TACCATGCTCTGCGGCTGCTCTGGTGGGCTGCTCTGGGACGGGGCCGGGGGTGGA- 

ATGGGAGCTGGTGGGGCAGGAGCAGGGGGCCCTGCCCTGGCCTCAGATCCCTCAGT- 

GATGGGGGACAGCTCTGGCTCCGGCCCCCCGGGCCCTGGCCCCCCATGACGATG- 

CAGCAGGGGCCTGCTCCATGGAGCCCCTGCGTTTGAGGGGCCGGGGAATTTCCGC- 

CAACACCCGTGCCACCTCCTCCAGCTCGGGCACCGACTGTGCCTCCGGTGGCAGTG- 

CTGGCTGCAGCCTCGTGGGGCTGAGAGGCCTTGCTACAGGGCC7TCATCCACATCG- 

CCAGCCTCCAGCACTGGTGTCAGCAGCCCCTCTATCTCCGGCTCAGGCTCCAGCTCG- 

GTGGGGGGTTTpGGGGGTCCTAGCCGGAACAAGAGCCCATCAGAGGACAGGTCCC- 

CAGGAGACACCCAACACTCCCTCTCCACAACTTCCAQGGCATACAACCAGCACATGAT- 

TTTCTGTGTGACCTGAGGGAAGTTCCTTGCCCTCTCTGGGCTACACTTTCCTTGGGCT- 

GTGAATAATATACAATTATGATGCCTC 

CTTTACATGCCTACCTTATTCTAATCTCACCACTGCTTTGTGAGGTA^ 

CATCTCCACATTACCGAAAGGGAATCTGGGCCTCAGAGAGGACAAGTCAGTTGCC- 

CAAAGCCATGCAGTTGGGACTTGAACTCAGTTCTGGCTGACTCTAGAATCTACTTC- 

TACCAACCGTGATAGATGTGATTTTCTGAGATCCTGAGAGTTTCCTCTCCTAACATCT- 

CAGGCAGAAAACTCCAGCAGGAAGTAGAATCCTG 

TACTCATTCTGACAGTAAAGCAGGTGGAAATAAAAATATGCATTATTGGCT- 

GAGTCGAGTGGCTCACACCTGTAATCCCAGAACTTTGGGAGGCCGAGGCAGGCA- 

GATCTCTTGAGATCAGGAGTTTGAGACCAGCCTGGCCAACATGGitAAAACCCTGTCTC- 
TACTAAAAATAr 

CAA AAAAAAAAAAAAAAAAAAAAAAATTAGCTGGGCGTGGTGGCAGATGCCTGTAATC- 
CCAGCTACTCGGAAGGCTGAGGCACAGGAATCGCTTGAACCCAGGAGGCGGAGGTT. 
GCAGTGAGCCGAGATTGCACCACTGCACCACTGCACTCCAGCCTGGGCAAAAGAGT- 
GAGATTTCATCTCAAAATATATATA^ 

TATATATAGTGTATATATATTTTrATATAGTATGCATATAC^TATAAA 

cacacacacggctgagcatggtggctcatgcctgtaatcccagcactttgggaggct- 

gaggtgggtggatcacctgaggtcaggggttcgagaccagcctggccaacatgg- 

caaaacctcatctctactaaaaacacaaaaaattagttgggtgtggtggtgcatgcct- 

gtaaccccagctacttgggaagctgaggtaggagaatcgcttgaacctgggaggtg- 

taggatgcagtgagctgaaacctcaccactgcattccagcctgggcaagaagagt- 
GAAACTCCATCTTGQCTGGGC^CGGTGGTTCACGCCTGTAATCCCAGCACTTTGG*
GAGGCCGAGGTGGGCAGATCATGAGGTCAGGAGATCGAGACCATCCTGGCTAACAT- 
GATGAAACCCCGTCrrCTACTAAAAATACAAAAATTAGCTGGGGGTGGTGGTGGGCGC- 
CTGTAGTCCCAGCCACTCGGGAGGCT6AG6CAGGAGAATGGCGTGAACCCGGGAGG- 
5 CGGAGCTTGCAGTGAGCAAGCACCACTGCACTCCAACCTGGAAGAAAGAGCGAGAC- 

tctgtctcaaaaaaaaagagtgaaactctgtctcaaaaataaataaataaataaaccc- 
caaaacacacacacatacacattatttcattgaatccccgtcacaattctatagggta- 
gatattattaatctctcttc^^ 

cagtcacacagcaagttagcagtgaagagagactccagcccatctgcttaactcact- 
10 gatctcacacctcaaaatattaataaattattataactaatatggtagctatttam 

gactgggtctcactctgtcacccaggctggagtgcagtggcgctatcacagctcact- 
gcagcctggatctcccaggcttaaatgatcctcccacctcaggatcctgagtagctg- 
ggjactac aggcgcccact accatgcccggcag attttrtgtacttttatttttag- 
taaagtctattttagtttcacta^ 

1 5 CAATCCTGTCTGCATTAGCCC ACCAAACTGCT AGGATT ACAAGGGTGAGCCACGGTG- 
CCTGGCTAATATGGTAGCTATTGATAGCnTACTATO 

TATTTTTGAGAGAGAGTCTCACCCTGTCACCTGTGCTGGAGTGCAGTGGCATGATCTT- 
GGCTCACTGCCACCTCCGCCTCCTTGGCTCAAGCTGAGTAGCTAGGACTACAGTGGT- 
GAGCCACCATGCCCAGCTAA I I 1 1 1 I I M 1 1 1 J 1 1 1 1 II 1 1 1 I GAT AGAG ATGGGATTT- 

20 CATCATGTTGTCCAGGCTGGTCTTGAACTCCTGACCTCAAGTGATCTGCCCACCTCGG- 
CCTCCCAAAGTGCTGGGATTACAGGTGTGAGCAACTGCACCTGGCCCATCAGGTGCT- 
GTTTTAAAGGCTTTATATGAATTTAATAACATATGTCAATAGGATCGAT^^ 
TTGCCI I 1 1 I I I I 1 1 1 1 1 1 1 1 1 1 I I IGAGGCAGAGTCTCCCCGTCACCGAGGATGGACT- 
GCAGTGGCGCAATCTCGGCTCACTGCAACCTCCACCTCCCGGGTCCAAGTGATTCTC- 

25 CTGCCTCAGCCTCCGAAGTAGCTGGGACTACAGGCGCCCGCCACCATGCCTGGCTA- 
Am 1 1 lGTATTTTTAGTAGAGATGGGGTTTCATATTGGCCAGGCTGGTCTCGAACTTCT- 
GACTTTGTGATCCGCCCGCCTCGGCCTCCCAAAGTGCTGGGATTACAGGCATGAGC- 
CACCGTGCCCGGCCCATTATTTCCCTTTTACACTCAAGAAAATTGAGGCCCAGTGAG«» 
GTTAAGTGACTTGCCCAAGGTCACACAGCGTGGAACCAGGCAGTCTGGCTTCAGG- 

30 GTCCACACTTAACCTTTGAGCTATCCCTGGCTCCTACCCAAATTCCCAAACTCACCT- 
GGCCTAGCTCTCTGCAGGGACAGTGCTTGTAAAGAGGCATTTGGCTGTGATCTCCC- 
CACCTCCCAGGGCTGGTCTGGTCCCCCTGCCATTTGTCCTCCCTTCACCCAGTCCTC- 
TAGGGCCCTCATTGCTGACTCACCTTCGTTCACAGGGGCCATGTCTGTTGGGGATGC- 
TGGGGGGCTGGGGTAGGGGTTTGGGGTTGGGTCTGGGGCTGTGGGGGCAGCTGG- 

35 GGCTGTGGTTGTGATTGTGGCTGGGGCTGTGGTTGTGG7TGGGGCTGCAGCTTAGG- 
CGGGGGTGCTCGGGTGAAGAGGGGGGACCCAGGGAGCATGGCGCGGCTGGCCC- 
.CGTGCTCCCAGAAGGCGTTCTGCAGCTTGAAGATCATGCTGAGGGGGATGGGACGC- 
TGGCGCGGGGCCCCGCGGGGCTGGGGGCTGGAGGGGGGCATGGGGATGCGGCT- 
GACGGGCTGCCAGCTGCGAGGCAAAGTGCCCGACGGCCCCGCGGAGCCCAGCGAG- 






C0CC60TA6CTGCCCGCGTCTGAACGCC66TCGCTGGCCAGAGGAGAQACCTT0TA- 
ATTGCGCGGCAGGGTGGCGCTAGTGAGGTTGTCCTGGGGAAGAGGGAAGGGA- 
GAAGGGGATCGGGTGAGAGAGGGAAGGTGGAGGGGAGGTAAAGACAAAAGACGA- 
GAAGG G AGAGGAGGTGAG GG AAGCCCTG GGAGTGAGGGAGAAGAAAGGGTG AG- 

5 gaaggagcagaaacccagcacagtgaagggagagcgtgggaacgggcgccgagac- 
cx:agatcgcagccccgagggggagactggccttgaccccgctcccccaccccactc- 
ctcgaccttccccagcctctcctccccaggcgtcgcctcctcaccttgccggtgccc- 
cccagtccatccaggctgctctccctccaaggcaacagctgcaggctcggcgagg- 
caggccttgcgaagacgtccaggcctgcggggcgggaatcattagggtctgtgggg- 

1 0 ctgcctctcctccgggtcctccattccccgggcctccaccactcacgttcatagc- 
tcgctgtctgcgaaggcttcttctcgtacgcx;acgtcca6gtcagactc.gttccagg- 
ctttcggaggccgccggcgcagcgtcaggtcgtctggggagaagtttccagggag- 
gatgagacgggaggggtggcgagccccggatcctgcccgctttgaccccgcgagt- 
caaaggccccgcgaggggcccctgggttcaccttgcgcgcgcagaggcggggcga- 

15 atgcgctgccgccggagcctagcagggagctcccgaaggcggacgctggcg- 

cgtcgtaggctgtggcaggggggcgcggtgacggcccacgctcggggaagaaggo- 
ctggggcccctccgccagggggctgccgcggggggagcctgcgcggcccag- 
gaagtcgaaaggcgtggggggaccctgctggcggagcgggcctggcccgggccg- 
cggggagggcgcacggccgagggagctgcctgcgccatcgaaggcgcggggccg- 

20 gggcgaggtcgcgcggtccaggctgccgtaggcgtccggctgcaggtagagcggg- 
gtgcgcggcgacgacggccgtcccttgggggacagcgggctgtaggggtgtagg- 
gttggggcactctctgatcgtccgaaqggggtgtctgcgccgtcggtggccgcct- 
tccggggggaccctcggctgccgaagggctcagggatcgagctggagctgtaccg- 
gggcggctgtggggaggccaggGcattgagggatggatcaaaggagacattagtg- 

25 gaagggttggtgtgtgggcgggggtgtcaagagagatcactggaggtcaaccca- 
gaggaggctgaccggccatggaaattcaggcacagagagcccaggtgagtagtggt- 
ggggagacagccctgaatcagcactgtggctagcccattactctatgtcacctttatg- 
ccacttaggtaaacacctctttccttctgagggtcc 
gtcccctcttrrctatttctttct 1 1 c 1 1 1 c 1 1 1 ctctctct7tcttttctttctttctt- 

30 TCCTCTCTCTCCTTCCTtt^ 

TTGCJrrGCTTTCTCTCTCTCTCTTTCTTTCTTT 

tcttttctatctcggcrcattgcagcctcaacctccctggcrrrag 
ttcagcctcccaagtagctgggattacaggtatgcaccaccacacctggctaactttt- 
gtatttttagtagagacagggtttcaccatgttagccaggctc 
35 gacctcaagtgatccgcctgtctctgaaagtgttgagattacaggcgtgaaccac- 
cgtgcccagccagatttttaaaaaatcatttgtagaggctggtctcaaactcttagtct 
caagcaattctctcacctcgccttccaaagtgctgggattccagqtctgagccatcg- 
cgcctggcctggtccccttttttcaagttcccttgaagagcccacaacctgcataac- 
tatAtggggcaattttgcctgaaatccaggcctctggtctggactgtggcgagaggc- 
TQGCTTTGGAQATCAAGGTQ6GAACCAGGCTTACCCTAGAAGGGGGTCCGGCCTG-

CGGGCCAGGAGGCGCGGGAGAGTCTG^CCACAGC^CTCCAGCTGCTTGGTCAGTT- 

CATCCACCTTGGCCGCCGCCGTGTCCAGCTCCATCTGCTTCAGATCCATGTGTTTCAT- 

GGCCAGCGCTGGGAAGGTGGGAGTGGAGGTAAGGACCTGGCCTCCTGGCAGGGGC- 

CGGCCTCAGCACCCCTCGCCCGCTGCCGAGGTCCCCGCCTCGCCAGCCCCGCCCCC- 

TACTCCAGCTTACACTGGAAGTTCATGTCX^AGAAAGTCCCGCGCGCTCTGGAATGCC- 

TCGCTGTCCATGGTGCCGGCCGGAGCGGGCGCCTGCATGGTGGGGAGGGAGGGAG- 

CTGGCTAAGACCCCGCCCCTCTAGACCCCGCCCTCAGGGAGTCAGACGCCGTGAG- 

GAGCGGGACAACGCCTCAACTCAGTTC.CTTCCCCTGGAAGCCCTTTACCCTTTCACC- 

TCCCCAGCTGGGAAATGCCAACTCCTCCAAAGCCAAGTCCATGCGCCACGGA- 

GAAGTCCAAACCCAGTCTAAAACCTCCGGAATC 

J I 1 1 II 1 1 1 1 II I I 1 1 GTGTATGTGTGTGAGACAGAGTCTCGCTCTGTCGCCCAGGCGG- 
GAGTGCAATGACGCGATCTTGGCTCACTGCAACCTCCGCCtCCCGGGTTCAAGCAA- 

atcttctccctagctgggactacaagcgcgcgccattatgcccggctaatttt^ 

tagttctgggattacaggagtgagtctccgcgcccggccgtgtccatctctttatct- 

cagtcctaagacctgwcactccttgmcaattatctattgatcacctacaatgtgc- 

cggtaaacataggatggaataactatgaattactgaatgtttactagggaccaggacg- 

cactgtgctagatcctgtttttgtttgtttttgagatggtgt^^ 

caggctggagtgcagtggcgcgatctcggctcactgcaagctccgcctccagggtt- 

catgccagtctcctgtctgagcctcccgagtagctgggactacaggcgcctgccac- 

catgcctggctaaai i ittgtatttttagtagagacggggtttcaccgtgtoagccag. 

gatggtctcgatctcctgaccgcgtgatc 

gattacaggcgtgagccacxgcgcccggcccttgtttttgttttttaataataa^ 

gctgtctgctgtgtactagaacccatgcdtactgcttggggtataatgtagtaaatg- 

tagtaaaaacaatatccgccgggcgcggtggctcacgcctgtaattccagcactttg- 

ggaggccaaggagggcggatcacgaggtcaggagagcgagaccatcctggctaa- 

catggtgaaaccccgtctctactaaaaataccaaaaattagccaggc6tggtgatg- 

gacgcctgtagtcccagctactcgggaggctgaggcaggagaacggcgtgaacccg- 

GGAGGTGGAGCTTGAACTGAGCGGAGATCGCGCCACTGCACTCCAGCCTGGGCGA- 

CAGTGCGAGACTCCGTCTTAAAACAAACAAATAAATAAATATGTTTAAAACAACAACAA- 

CAATAACCAGCCAGGCGCGGTGGTTCACTCCTGTAACCCGAGCACTTTGGGAGGC- 

CGAGGTGGATGGATCGCTTGAAGCCAGGAGACCAGCCTGGGCAATATGGTGAAACC- 

CCGTCTCTACAAAAAAATACAAAAGTTAGCTGGGCATGGTGGCATGTGCCTGTAATCC- 

CAGCTACTCAGGAGGCTGAGGCACAAGGCTCACTTGAACCTGGGAGGCACAGGTTG- 

CAGTGAGCATAGATTGTGTCACTGCACTGCAGCTTGGGTGACAGAGCGAGGCTCTAT. 

TTAAAAAAAAAAAAATTAATTGAGGGGCCACTCCCTTCTAGAGTGGTGAG/y^TGC- 

CGTG.CACCGAAAGCTTCATTTGATGGTCAAAACCACCCTAGCAGGCAAGAAAGCAT- 

GGCTCAGAAACATATGTTCAAGGTCACCCTGCAAGAAGTCGGTAGTMTCGGTTTCA- 

CACCCGCATCTAACTTATTCTGGGTCATCTCTACCAGATTAGAGGGGTCCTAGAGG- 
gaagcgactgctcagcttcctttccctagggtccccattcagtggaggtctgqctct*

CACTGACCCATTGTTAGCAAGAGGAACAGGGAGGTGeCCAGGGGTGGAGGGGCAGC- 

TGTGGTCACTGGCCCAGTGGGAGGGAGCTAGGCCACTAGGAACCGGTCAGGCCAG- 

CACCATCCCTATCCCCATGCTAGCGACCACACCCACCAGCTCTGCCACCTCCCTGCT- 

GGATCGACCACTTAGCTCTGGCAGTATAGGCAGCAGGGCAGGCTGGGGCATGCTGA- 

TACCCGCCTCTGTCTGQGAAGTCGAAGGAACAGAACCTGTTCAGGCTGGCGGCTCAT- 

rTGGATGAACAGGGAGTGTGTGACCTTGGGCGTTGAGTCCTCTCCACTCCCTGGGCC- 

TCAGTCTCCCCAACATCAAAGAAGAAGGCAAATCAC Cl 1 1 1 1 I I 1 1 1 1 i I 1 IG AGATAG- 

GGTCTCGCTCTGTAACCCAGGCTACAATTGTGACTCACTACAGCCTCTTGACCTCC- 

CAGCTCAAGTGGTCCTCCCACCTCAGCCTCCTGAGTAGCTGAGACT 

TCGCACCACCACACCCAGCTA A \ I M N » I I I 1 1 II I M M M 1 N I U ) I H I 1 1 GA> 

GACGGAGTCTTGCTCTGTCGCCCAGGCTGGAGTTCAGTGGCGGGATCTCGGCTCAC- 

TGCAAGCTCCGCCTCCCGGGTTCACGCCATTCTCCCGCCTCAGCCTCCCAAGTAGCT- 

GGGACTACAGGCGCCCGCCACTACGCCCGGCTAATTTTTGTATTTTAGTAGAGACG 

GGTTTCACCATTTrAGCCGGGATGGTCTCGATCTCCTGACCTCATGATCCGCCCGCC- 

TCGGCCTCCCAAAGTGCTGGGATTACAGGCGTGAGCCACCGCGCCCGGCCACCCAG- 

CTAATTTTTrAAAAACATTTTGTACACTTTGGG 

GTCAGGAGCTCGAGACCATCCTGGCTAACACAGGTGAAACCCTGTCTCTACTAAAAA- 

ATACAAAAAAATTAGCTGGGCGTGGTGGCGGGCGCCTGTAGTCCCAGCTACTCGG- 

GAGGCTGAGGCAGGAGAATGGTGTGAACCAGGGAGGCGGAGCTTTCAGTGAGCCGA- 

GATCGCGCCACTGCACTCCAGCCTCGGAGACAGAGCGAGACTCCGTCC- 

CAAAAAAAAAAAAAAAAAAAA7TTGTAGAGACAGATCAAGTCTCACTTTGTTGCTCAGG- 

CTGGTTTTGAACTCCTGGGCTCAAGCAATCCTCCCGCCTCAGCCTCCCAAAGTGCT- 

GAGATTACAGGCATGAGCCACCACACCTGGCCAAATCAGCTATTCTGAAAGGCCCCTT- 

TAATCTCTATGAGCCCCAGACTTTCAAACTGTAAGGACCTTAGGACTGTAACTAAAGT- 

TCTACAGAGCCTAAACCCCTCAGCTAAAGAGCCTATTGTTGGAAAGTTCTGAGTC 

GATTCTATCTTTGGAACATTCTAGAATTCTCCAATTTGTCTAACCCAGAATTCTGAGTCT- 

TTCTGTACCACATTCTACCTAACCCAGGGTTGCACTGCTCTGGAAGTCTAGATGGATG- 

GTATAGTGCAGCTGGTAAAAGCATGAGTAAGAAGTCAGACTTCAAAAATTCAAATCT- 

GAGGGCCGGGCATGGTAGOrTCTGCGTGTAATCCTTGCACTTrGGGAGGCCGAGGG- 

GGGAGGATCACTTGAGGCGAGGAGTTCAAGACCAACATGGCCAACACAATGAGACCC- 

CATTTCTTAAAAAAAATTAAAATAAAATCATCAAATCTGGCAGCACCACCGTCCAACCC- 

TGACCACAGTACCTCAGTCTCGTAATCCGTAAAATGGGGATGAAAGTTCACCTQATAG- 

GACTACTGTAAGAATCCACCTGGTCAGAAGGTGCAGGAAGAATTCAGAGCTCTGAGA- 

ATTGAGGCCTCAGGAAGAAGAGACTACAGGAATAAAAACTCGGGCATTTAGAATTTCA- 

GAGATACACAAACAATACTTTGTTAACTGTTAAAATAGATAAATGAGCAAGTCTGTG- 

CAGCCCTAATGCGAGCTGTAAGTGACT Ci rTTTTTTTCTn I G GTA<3AGATTTAGTCTC- 

TCTCGCGt^TGTGGTTAGGCTGGTCTCGAACTCCTAGCCTCATGGGATCCTCCCCGG- 

CTCGATCTCCCAAAGTATTGGGATTACAGGCGTGAGCaCGGCGCCATGATCCCGAA- " 
ATTTCCAAGATTCTCAGATTCCATACTGACATTCTCTGGCTCTCAGGAAATGCCAACC^
TGGGTGTGGGGCTGTCGCGQGGACAGGCGGTGGGGACGTCGGAGCCACCAGGGGG- 
CGGTCACGCCCGGACCCCCGCCAGGAGGGCGGACTGCGCCTGAGCTCAGGCCCGG- 
GGAATGCGCAGCGGGCCCGGGCAGGTGCTGTACATCCCGGGGCAAGGGAGCTGGG- 
5 CCGGGCGGGGTACAAGGGCGGGGCGCGGGGGTGGCGCGGGCCGTGTGTCTGTTCC- 
CAGGCCTCTGCCCCTGACCTCTGCCTCCGAGTCCTCTCCCATGTGCTCCCCTCTAGC- 
TCTAGCTCCGAGCTCTCCCGCGGGCTCTGGGCCAGCCGCAGGTACTCTCCCCTGGG- 
CTCCTCTCTCCGCTCCACCCCTGGCTCTCCTTCCCTGGCCTCCTCTGCACCCCAGC- 
CAGGTTCTTTAGGGCTAAGGATCCTGTGGACTTCCTGGAGGAGTCATCTTCAGTAG- 

10 GAACCGGGTCAGAGAGCCAGACTGAGCTGGGAACACCCAGGCTGGACTCCTACAGC- 
CCTGTCGGGTCACACTGAATCTGGAGAGGCTCCACTGTCTCTGGGACTCGGTTTCC- 
TCCTTTGTGGACGTCTATGGAATGGGCTAGGGCCTTTCTTGCTCTAAGCCTCTACTT 
GGCTTGTTATTTAGCTTCTCTGTGCCTGTTTCCTCATGTGGACCATGGGAAGAATTA- 
ATACCTTCGCCTCAAAGGGGTATGAGGATTGAGTGACATAATTTATAAGCCGTGATTA- 

1 5 GAACAATGCAGTGCGCGAAATAAAGTTCACACATACAGGATTCATAATTACCAGAT- 

GTCCTTGGCTGTTCATTATAATAACACAGGGTCTGGCAACAGAGTGAGGGGTCCAGAC- 
TCAATGTAAI I J 1 1 1 1 1 I CCCCTAAAAGGGCCCTTTCAACTCTTTCTGAGATCATACAAG- 
CCCTGAGTTTTGACACCCAGGGTCTCAACTTCCTGAGCCCTTGCCTCTCAGAGTCC- 
TAAATTTCCCCTGTACATTCCTGAGTCTGGCCAGTGATCACCCTCAGTCACTTAGG- 

20 GACGGGAGGGCTGGGAGAGCCCTGGAAGATTCCAGACAGAAGCTGGCAAAAGCC- 
CAGGGTGTGGGCAATATCCACTCTCCAGCCTCCGTTTCTCCACTCGTAATGAG- 
GAGTCCTTCCCTGGGGTCAGCAAACCTTA7TCAAAGGGAGACCTCTCAGTCACCCAA- 
GATTCCTCTAGACAATGCGAGCTTTCCTACCTACCTACCTACCAGCTCTGAGCTTGG. 
TACACCCAGAGCCCTG I 1 1 1 G GCAACCACGQTTATT A I I 1 I I AATTTOATTTCAGGT- 

25 TATCATCAAATGCCCTTCAAGCCCAGACATTGGGAAACACTCCTCTCTCATCAGATGC- 
TCGCCTCCCCCATTCTGTTTTTAATCCCCCTTCrTAGGACGC^TG^ 
GAACGGGGAGATAGACAGAGGGAGGTGCCTGGTCCTGCCCTCCCCCCGCCTCAAG- 
GACAGACAGACACCTCCAGAATTAGCCTCTGTCCCTCCTTATCTCCCACAATACCC- 
CAGGTCAGACAGATGGGCGTGGAGGTGACATTTCTCACCTCAGGGTCAGGGCAAG- 

30 GAGCCCTGAGGCAGAAGGTTAGTCAGAAAATCTGGCGGGGGCGGATGGAATCC- 

CGTCCCCCAGAGAGCTGCAGAAGAAGGAGGAGGCAGAATCCTGACCCTAGAAACTC- 
TACTGCCTGTGTGAGCTCCAAGCCTCAGTTTACXCCTTCCTCrrCCGTGTAATGGTTAA- 
ATGCCCGGCTATGCAAACCTCCCAGAATCCAATAGCCGCTTTCCGGAATTCTGCCCT- 
GGGTTCTAGAACTACCTCTGCAAACCCAGCTGTTTCCCACCCCATAAGGCAATAGGG- 

35 GAGCCCACCTCCGCCAGGGGGTGCCCTAGGGCGGATGTCCCTTCTCTGGTTAGG- 
CAGGTCTGACGCCCAGGTTAATGACATGTTGGGTTCGCTCAGCGGCACAGAGGAG- 

gttggagatctgcctcggtgttttctctCctaccccgcccccatccccgagc- 

CGAAAAGTCGGGGGAGAGCCGGGACACAGCCTCCGGAGGGACCCCGGGTACCT- 
GTCCTGCTCCACTTCAGGAACCQAGGCTCCACTATCCCTGCCCCACCCTTAATT 
tcagagacctagaagatcggtcgagacagcagcttgaggctggcagggtggtcacc-

cattccaccttgagccccaccagtctgagcctctcattc 

tcgaacccctatactaccgaaagactcgggttcctagagccccccagttcgagggac- 

tcaggaattccagctccaacgtctccccgggatgaaggggtagaatecctccattc- 

caagaattcaggcatccgaacccgctttccttccctccagtaaaacaggcaacggagt- 

TTCCTTCTAAGGATCCAGGTGTCGGCGCGCCCCAAATrCCG'CCCTGGGACCTGG* 

CGTCCGAGTCCCCTCCCAATCCTCCCAGGGACGCGGGTGTTGGGCTTnTCAGGGCG- 

TCTGGTCCCCAGGAGGGTGAAACTCACGGATCCGGGCAGATCCTGGCACCTGGGGG- 

CTTCCTCCAGCTCGGGCTCCGGCTTGGGGAGCGGAGAACGGGGCGGGGCAGGAGC- 

TGGGAACAGGTTAGACGACGTGACTTGGGCTGGAGGGAGGCGGGTCCCGGTGGG- 

GAGGGGGAGCCAAGGTCGCCTCGAGCACCTTGGGACTTGTAGTCCCGGAGGGACAG- 

GACGTAGCCCAAGACGATCCCATTTGGATTCACCCAGAGTCCATTTCACAGACAG- 

GAAGGGCGAGGCCCAGAAGCCGAGAGCGACCAGGCCAGGGAGATACAGAAGAGC 

CGAGACGCCTGCCTCGCTGTGGCTGGAGACTGACTCCTGAGCCCTTGCCCCACCCCT- 

tcaggcgcactatcccctttcctgatcagtatcccccagggtctctgagcccgaatc- 

tccccgtcgataaaaagcgcgggttggatcttcaaaggatgtcccagcaagagtt- 

caaaatcttagtttggactacaacccccagcagcctccgcgaccgccctcgggcgac- 

tcttrgcctcgggtcctgtgggaattgtagtcctggagcccgcagggctgcaccc- 

cggtgtctctctcgcccacgcgaaggaaaccgtctggagatcctggataggggaaa- 

catttccccttccccttgaccctccctccgctctggaaagcctctcccacctgggga- 

gaaggggtgccccaattctggagtaggatcctaaatcttggcagagggggcgg-- 

gaagtggcgctgacacactggccaggaatgcagtcgggtcaccctgtctagccac- 

cgtctcgcggctccaaccgccgcccaacgcggggcggccccagtgggaagg- 

GAAGTGGGTGCGTCCCCCAAATCTGTGTCCACGTGCCGCTGTTTACACGCTCCCTGG- 

GGCAGGGAGGAGTCGCCGATCAGGTCCCTTCCTGAAAGTCATCGAGGTTTCCCACG- 

CATGAGACTAAACCCCCGAGGGCATCTACAAGTCCCATTrGATCCACAAACGCTACAC- 

CGTGCCCAGCACCACTCCACGCGTGTGGGGCTCCTGGGTCCGAGGCTCCGCCC- 

TCGAGAACCACAAGCTCCTCCCCCTATGTTTCCCGCTCCCCCGGAGTCCAGAAGCCC- 

CGCCCCTGGCTGGAACTTCACGCCCTCCGGACGGATTGCCCCTATTTCTCCATTTTCC- 

CGCTTCTCCGAGTCAAGTTCTGAACTTGTGAGGCATCTGGGCCTCCCCAGAAGACATT- 

TAACACAGAAAGCACAGCCCTACTAACTAGTATTCTTACCTGTCTCTTCAAGAATTTCA- 

GACCAATCGACCGTCCTGTCTCTTTAAGGCTTAGGAAGAGCAGTGTGGCTGCCCCTT- 

taaggaggcg7tgcaacaaaccatattggacagacgatgggggcgacccatcgg- 
gacccgacgggcctctgactccagcaatAcagcgaatcagcgggtttcgggaata- 

CAm I I rCGGAAAAAGACTTCTTCCTCGGTTTTCTGCTCTGCACACGTTGAAATTTTCC- 

CCAGTTTTTCCTGCAGATCGGGAGTCGAGCAATGCCTACCCCCGCGCTCCCGCAC- 

CAGTTGGGCGCTCCCGGATGATGCCCTACCCCTTTGGATCCACGTGGTCTGCAACCT- 

GGTGCGAGCAGCCCGGGCTACAGGGTTGCCTGAGGTGTGGGTCCCAGGATGGAG- 

GAGCCCCAGGCCGGCGGTGAGGGTGCGGGTTGACGGGGTGCGGAGGGTGCGTTG- 
ST00AA6GAGAAAGGGGCGTCCGA6AGGGTTCGGGCQQAAAAGGAGGCGTACCTQ-
CAAGGAGGACTTGCGAAGAGCGTGCATTCCCAGTGGGCGAACGGGAATTCGAACG- 
GAGAGAGGGTTATCTTGTGGGGGGCTACCCGTGGAGAGCAAGGCGCCCCCAGGG- 
GTTGGATCGGTGAAATTGAGGTCGCCCCTGGGGAACAGGTGGGCAGAAAGGA- 
5 GAAACCAGGTTGAGGGGACTGGAGTGCTCACGAGGTTAAGaCCAATGGACCGA- 

TAGGCGCGCCCTGCAAGATTGGACCGGCAAGGAGGTGTCAGTCGACCCCATTTCCCC- 
TTCTGCTGCAGATGCTGCTCGGTTCTCTTGTCCCCCCAACTTTACCGCGAAGCCCC- 
CAGCCTCAGAGTCCCCTCGTTTCTCCTTQGAGGCGCTGACGGGTCCAGATACGQAGC- 
TGTGGCTTATTCAGGCCCCTGCAGACTTTGCCCCAGAATGGTGAGTGGTCTTGTT- 

10 gacggaaaagagggtcccggtccagaccccaagagcgggttcttgaatttgtcacag- 
gaaagaattagaggtgagtcacagagcacagtgaaagaaacaagtttattggaaac- 
tactcctttacagagtagagtgtcctcagaaagcagggggagaaacccacagccct- 
ttgttagtatttctacttataagaaactataagqaacjatagttaaacttggagtgtg- 
cagataagctcactaaaggtaggggctattggtgttatccacgaccattaatcctg- 
1 s caacctaagcttgctcatttatgttatatttaagtaatgggggctgcattcttagga- 
catttggacattctgcaggcttggtggaacatgttctgtatggccataaatattctgta. 

ATTATAATTGGTGGTCAGCCTGGGATGTGGTTA-rnTCAGGCCATAAGCATGAACCTT- 

gtaagtgcctagctactcactttaagatggagtcactctagtcatgttttattaaaaac- 

CAGAGGCCAGCCAGGCGCAGTGGCTGGTGCCTGTAATCCCATCCTTTGGGAGGC- 
20 CGAGGCGAGCAGATCACTTGAGGTCAGGAGTrCAAGACCAGCCTGGCCAACATAGT- 
GAAATTGTCTCTACTAAAAATACAAAAATTGGCTGGGCGTGGTGGCAGGTGCCTGTA- 
ATGCCAGCTACTTGAGAGGCTGAGGCAGGAGAATCGCTTGAACCCAGGAGGTGGA- 
CATTGCAGTGAGCCGAGATCATGCCACTGCACTCCAGCCTAGGCAACAGAGCAAGAC 
TCTCTCAAAAAAAAACAAAAAAAAAATCAAAAAACCTTCCCTCTCCTGTTCCACTTAAG- 
25 CCTCTGCCCTCCCTGTTTGTCTCTGTAGCTTCAATGGGCGGCATGTGCCTCTCTCTGG- 
CTCCCAGATCGTCAAGGGCAAATTGGCAGGCAAGCGGCACCGCTATCGAGTCCTCAG. 

cagctgtccccaagctggagaagcgaccctggtggccccctcaacggaggcaggag- 
gtggactcacctgtgcctcagccccccagggcaccctaaggatccttgagggtccc- 
cagcaatccctgtcagggagccctctgcagcccatcccagcaagtcccccaccaca- 
30 gatccctcctggcctgaggcctcggttctgtgcctttgggggcaacccaccagtca- 
cagggcctaggtcagccttggcccccaacctgctcacctcagggaagaagaaaaag- 
gagatgcaggtgacagaggccccagtcactcaggaggcagtgaatgggcacggggc- 
cctggaggtggacatggct-ttggggtcgccagaaatggatgtgcggaagaagaa. 
gaagaaaaaaaatcagcagctgaaagaaccagaggcagcagggcctgtggggaca- 

35 GAGCCCACAGTGGAGACACTGGAGCCTCTGGGAGTGCTGTTCCCGTCCACCACCAA-* 

gaagaggaagaagcccaaagggaaagaaaccttcgagccagaagacaagacagt- 

GAAGCAGGAACAGATTAACACTGAGCCTCTAGAAGACACAGTCCTGTCCCCGAC- 

caaaaagagaaagaggcaaaaggggacggaagggatggagccagaggagggggt- 

GACAGTTGAGTCTCAGCCACAGGTGAAGGTGGAGCCACTGGAGGAAGCCATCCCTCT. 
GCCCCCTACGAAGAA.GAGGAAAAAAGAAAAOQGACA6ATGGCAATGAT6GAGCCAG-

GGACGGAGGCGATGGAGCCAGTGGAGCCGGAGATGAAGCCTCTGGAGTCCCCAGG- 

GGGGACCATGGCGCCTCAACAGCCAGAAGGAGCGAAGCCTCAGGCCCAGGCAGCTC- 

TGGCAGCTCCCAAAAAGAAGACGAAGAAAGAAAAACAGCAAGATGCCACAGTGGAGC 

CAGAGACAGAGGTGGTGGGGCCTGAGCTGCCGGATGACCTTGAGCCTCAGGCAGC- 

TCCCACATCCACCAAGAAGAAGAAGAAGAAGAAAGAGAGAGGTCACACAGTGACT- 

GAGCCAATTCAGCCACTAGAGCCTGAACTGCCAGGGGAGGGACAGCCTGAAGCCAG- 

GGCAACTCCGGGATCCACCAAGAAGAGGAAGAAGCAGAGTCAGGAAAGCOGGATGC- 

CAGAGACAGTGCCCCAAGAGGAGATGCCAGGGCCGCCACTGAATTCAGAGTCTGGG- 

GAGGAGGCTCCCACAGGCCGGGACAAGAAGCGGAAGCAGCAGCAGCAGCAGCCT- 

GTGTAGTCTGCCCCCGGGAAACTGAGGAACTAAAGAAAGCTGAAGGTGCCCACCTG- 

GGCCACCAGAAGGTGACACCCCCAGAATCCCTCCCCAGAGACTGCACCAGCGCAGCC 

Example 7 



The cases and controls in Example 6 had been individually matched with respect to
age, menopausal status and hormone treatment. Therefore, it was possible to make
a paired analysis. This generally reduces the possibility of bias and confounding, but
often produces less significant results. When the "high-risk" group was analysed, i.e.
the genotype homozygous for the high-risk alleles of RAI, ASE-1e1 and ERCC1e4
versus all other genotypes, we found a rate ratio (RR) = 1.64, confidence interval
(CI) = 1.17-2.29, with a level of significance p = 0.004. Thus, the "high-risk"
genotype was clearly overrepresented among the breast cancers.

Example 8 

In the data of Example 7, the "high-risk" group was further analysed, i.e. the
genotype homozygous for the high-risk alleles of RAI, ASE-1e1 and ERCC1e4 versus
all other genotypes, among those pairs that were less than 55 years of age. This
increased the difference dramatically, indicating that the high-risk genotype
predisposes to early breast cancer (rate ratio (RR) = 9.5, confidence interval
(CI) = 2.21-40.79, with a level of significance (p) = 0.003). In older age brackets,
the RR was still above 1, but not significantly so. Thus, the combination of the
three SNPs allows for the definition of a high-risk group for early breast cancer.
Example 9

Blood samples were collected from a large number of Danish citizens and frozen
(Example 6). The persons were also interviewed about a number of issues including
smoking habits. After a number of years those persons who got lung cancer in the
intervening period were identified, as well as a set of matched controls. DNAs were
purified from the blood samples and a number of polymorphisms in and around the
region, namely XPDe10, XPDe23, RAIi1, ASE1e1 and ERCC1e4, were typed. The
three latter polymorphisms were combined into a "high-risk" group that was
homozygous for the high-risk alleles of all three polymorphisms. All other genotypes
at the three loci were combined into a low-risk group (Example 6). XPDe10 and
XPDe23 were not combined with other markers. The results are shown in Table 15.
It is clear that the "high-risk" genotype is associated with lung cancer in the
youngest age group. XPDe23 shows signs of being associated in all age groups, while
XPDe10 did not appear to relate to the disease. Therefore we recalculated the
results for the youngest age group without XPDe10. Table 16 shows the results.
Calculated this way, both markers were related to the risk of lung cancer.
Table 15. The risk of lung cancer in three different age groups in association with
the high-risk genotype, XPDe10, and XPDe23, mutually adjusted for each other and
for the duration of smoking.

High-risk genotype

Age at      High-risk   Rate Ratio   Confidence      P-value
diagnosis   genotype    (RR)         Interval (CI)
50-55       No          1
            Yes         4.43         (1.45-13.56)    0.009
56-60       No          1
            Yes         0.73         (0.30-1.83)     0.51
61-70       No          1
            Yes         0.93                         0.82

XPDe10

Age at      Genotype    Rate Ratio   Confidence      P-value (trend)
diagnosis               (RR)         Interval (CI)
50-55       GG          1                            0.99
            AG          2.78         (0.57-13.7)
            AA          1.2          (0.14-10.4)
56-60       GG          1                            0.17
            AG          0.46         (0.18-1.20)
            AA          0.41         (0.09-1.93)
61-70       GG          1                            0.40
            AG          0.91         (0.46-1.80)
            AA          0.64         (0.25-1.64)

XPDe23

Age at      Genotype    Rate Ratio   Confidence      P-value (trend)
diagnosis               (RR)         Interval (CI)
50-55       AA          1                            0.25
            AC          1.69         (0.34-8.41)
            CC          3.62         (0.39-33.6)
56-60       AA          1                            0.11
            AC          1.90         (0.73-4.92)
            CC          3.40         (0.71-16.3)
61-70       AA          1                            0.08
            AC          1.86         (0.95-3.63)
            CC          2.23         (0.79-6.31)



Table 16. Risk of lung cancer among those 50-55 years in association with the
high-risk genotype and XPDe23, mutually adjusted for each other and for the
duration of smoking.

Polymorphism                 Rate Ratio   Confidence      P-value
                             (RR)         Interval (CI)
High-risk group   No         1
                  Yes        4.27         (1.42-12.89)    0.01
XPDe23            AA         1                            0.01 (a)
                  AC         3.20         (1.13-9.02)
                  CC         5.02         (1.32-19.1)

(a) Trend test
Example 10 



In some of the samples of Example 6 we typed a 4 bp deletion (dbSNP#3916791)
located in the common portion of the sequences S1, S2 and S3 contiguous with
sequence SEQ ID NO: 1. Specifically, the polymorphism is contained in the sequence
GCGCCTGCCAAGAT[TGTC]TGAGTATTGATCGAACCC, where the bases shown in
brackets are present in some human chromosomes 19 but not all. The deletion was
typed by (1) performing a PCR on the person's DNA with the primers
5'-6-FAM-TGAGACGAGGTGGAGG-3' and 5'-CAATCAAAAAGAAAACATGG-3'. The
fluorescein-containing (6-FAM) primer was obtained from TIB-MOLBIOL (Berlin,
Germany), while the other primer was obtained from DNA Technology (Aarhus,
Denmark). The reaction mix contained 0.84 U Taq polymerase (Roche), 1.7 nmole of
each dNTP, 5 pmole of each primer, 1X PCR buffer (Roche), 1 M betaine and
approximately 20 ng DNA in a total volume of 9 µl. We used a temperature program
containing 4 min denaturation at 94 °C, followed by 30 cycles of 96 °C for 1 min,
55 °C for 30 sec, and 72 °C for 45 sec; (2) we then mixed a sample containing 1 µl
PCR product, 0.5 µl GeneScan-500 ROX size marker (Applied Biosystems) and 19 µl
formamide; and (3) loaded the sample onto a single lane of Sequagel-6 matrix on a
model 3100 Genetic Analyzer (ABI Prism, Applied Biosystems) using fluorescence
detection. The persons who were homozygous for the complete fragment gave a
length of 167 bp relative to the size markers, the persons who were homozygous for
the 4 bp deletion gave a length of 163 bp, and the heterozygotes showed both
lengths in roughly equimolar amounts. Because it has repeatedly been observed that
the underlying risk genotype seems recessive (Examples 2, 6, 7, 8), we pooled the
homozygous low-risk genotypes (163/163) and the heterozygotes (163/167).
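
The genotype call and pooling described above can be sketched in a few lines of Python. This is only an illustration: the function names and the 1 bp sizing tolerance are assumptions introduced here and are not part of the described method; only the fragment lengths (167 bp and 163 bp) and the pooling rule come from the text.

```python
# Illustrative sketch only: names and the tolerance are assumptions.
FULL_LENGTH = 167     # bp, fragment without the 4 bp deletion
DELETED_LENGTH = 163  # bp, fragment carrying the 4 bp deletion


def call_genotype(peak_sizes, tolerance=1.0):
    """Call the 4 bp deletion genotype from sized fragment peaks (in bp)."""
    sizes = list(peak_sizes)
    has_full = any(abs(s - FULL_LENGTH) <= tolerance for s in sizes)
    has_deleted = any(abs(s - DELETED_LENGTH) <= tolerance for s in sizes)
    if has_full and has_deleted:
        return "163/167"
    if has_full:
        return "167/167"
    if has_deleted:
        return "163/163"
    raise ValueError("no peak near 163 bp or 167 bp")


def pooled_group(genotype):
    """Pool 163/163 with 163/167, keeping 167/167 as its own group."""
    return "167/167" if genotype == "167/167" else "163/163 + 163/167"


print(pooled_group(call_genotype([163.2, 167.1])))
```

A heterozygous sample with peaks near both lengths is called 163/167 and falls into the pooled reference group.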

Table 17 shows the observed genotype frequencies among the cases and controls,
the Odds Ratios for the genotypes, the confidence intervals, and the p-values for the
Odds Ratios. Clearly, homozygosity for the 167 bp fragment was associated with
increased risk of breast cancer.

Table 17. Risk of breast cancer in association with genotypes of the 4bp deletion in
S1.

Genotype            Number of   Number of   Odds Ratio   Confidence      P-value
                    cases       controls    (OR)         Interval (CI)
163/163 + 163/167   92          129         1
167/167             60          44          1.91         (1.19-3.07)     0.007
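
As a worked check of Table 17, the odds ratio can be recomputed from the four counts. The sketch below assumes a Woolf (log-based) 95 percent confidence interval; the text does not state which interval method was used, so that choice and the function name are assumptions.

```python
import math


def odds_ratio_with_ci(exposed_cases, exposed_controls,
                       reference_cases, reference_controls, z=1.96):
    """Odds ratio with a Woolf (log-based) confidence interval (assumed method)."""
    odds_ratio = (exposed_cases * reference_controls) / (
        reference_cases * exposed_controls
    )
    # Standard error of ln(OR) from the four cell counts.
    se_log_or = math.sqrt(
        1 / exposed_cases + 1 / exposed_controls
        + 1 / reference_cases + 1 / reference_controls
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Table 17: 167/167 (60 cases, 44 controls) versus the pooled
# 163/163 + 163/167 reference group (92 cases, 129 controls).
or_, lo, hi = odds_ratio_with_ci(60, 44, 92, 129)
print(f"OR = {or_:.2f}, 95% CI = ({lo:.2f}-{hi:.2f})")
```

Under this assumption the counts reproduce the tabulated odds ratio of 1.91 and the interval 1.19-3.07.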



Example 11 



The blood samples described in Example 9 were analysed for the 4 bp deletion
described in Example 10, and the results were combined with previous results for the
polymorphism XPDe23. As a preliminary investigation showed the effects of the
genotypes to be largely additive, we grouped the persons according to the number
of "risk" alleles they were carrying, using the genotype XPDe23 AA, 4bp 163/163 as
the lowest risk, thus placing those persons in group 0, and furthermore using them
as reference for the calculation of the Odds Ratios. Table 18 shows the number of
cases and controls in the different groups, the Odds Ratios for the different groups,
the confidence intervals for the Odds Ratios and the p-values for the Odds Ratios
(calculated by the two-sided Fisher's exact test). Clearly, the risk of lung cancer
increased dramatically with the number of risk alleles.

Table 18. Risk of lung cancer according to the number of "risk" alleles in the
polymorphisms 4bp and XPDe23.

Number of        Number of   Number of   Odds Ratio   Confidence      P-value
"risk" alleles   cases       controls    (OR)         Interval (CI)
0 [1]            3           12          1
1 [2]            57          73          3.12         (0.84-11.6)     0.10
2 [3]            123         129         3.81         (1.05-13.8)     0.034
3 [4]            49          35          5.6          (1.47-21.3)     0.01
4 [5]            4           1           16           (1.27-200)      0.03

[1] XPDe23 AA, 4bp 163/163
[2] XPDe23 AC, 4bp 163/163, and XPDe23 AA, 4bp 163/167
[3] XPDe23 CC, 4bp 163/163; XPDe23 AC, 4bp 163/167; and XPDe23 AA, 4bp 167/167
[4] XPDe23 CC, 4bp 163/167, and XPDe23 AC, 4bp 167/167
[5] XPDe23 CC, 4bp 167/167
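
The grouping by number of "risk" alleles and the odds ratios of Table 18 can be illustrated as follows. The sketch assumes, as the table footnotes indicate, that the counted alleles are C at XPDe23 and the 167 bp allele at the 4 bp deletion site; the function names and the genotype string formats are hypothetical.

```python
def count_risk_alleles(xpd_e23, four_bp):
    """Count risk alleles, e.g. xpd_e23='AC', four_bp='163/167' gives 2."""
    return xpd_e23.count("C") + four_bp.split("/").count("167")


# (cases, controls) per number of risk alleles, taken from Table 18.
TABLE_18 = {0: (3, 12), 1: (57, 73), 2: (123, 129), 3: (49, 35), 4: (4, 1)}


def odds_ratios_vs_reference(counts, reference=0):
    """Odds ratio of each group against the reference group."""
    ref_cases, ref_controls = counts[reference]
    return {
        group: (cases * ref_controls) / (controls * ref_cases)
        for group, (cases, controls) in counts.items()
    }


odds_ratios = odds_ratios_vs_reference(TABLE_18)
for group, value in sorted(odds_ratios.items()):
    print(group, round(value, 2))
```

The resulting ratios (1, 3.12, 3.81, 5.6 and 16) agree with the Odds Ratio column of the table; the p-values from Fisher's exact test are not recomputed here.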

Example 12

The data of Examples 9 and 11 were combined, and relative risks for lung cancer for
the high-risk haplotype, the 4 bp deletion, and XPDe23, mutually adjusted for each
other, were calculated in 3 age groups. The use of adjusted relative risks ensures
that the effect of each marker is specific to it and cannot be attributed to any of
the other markers in question. Tables 19, 20, and 21 show the results. After the
adjustment it is apparent that all three markers have an effect independent of the
others. Moreover, the adjusted effect of the high-risk haplotype is strongest among
the youngest persons, while the adjusted effect of the 4 bp deletion is strongest in
the oldest age group. XPDe23 exerts its adjusted effect at all ages, but is possibly
strongest in the youngest age group.






Table 19. Relative risks and 95 percent confidence intervals for lung cancer in
different age groups as a reflection of presence or absence of the high-risk
haplotype in homozygous form, adjusted for the 4bp deletion and XPDe23.

Age at diagnosis   Homozygous (a)   RR     95% CI
(Yr)
50-55              No               1.00
                   Yes              4.26   1.38-13.17
56-60              No               1.00
                   Yes              1.07   0.36-2.98
61-70              No               1.00
                   Yes              0.82   0.44-1.53

a) Homozygous carriers of the high-risk haplotype are defined by homozygosity for
the high-risk alleles at ERCC1 exon 4, ASE-1 exon 1 and RAI intron 1.



Table 20. Relative risks and 95 percent confidence intervals and p-values for
trend for lung cancer in different age groups as a reflection of alleles at the 4
bp deletion site, adjusted for XPDe23 and the high-risk haplotype.

Age at diagnosis (Yr)   Allele    RR     95% CI      P(trend)
50-55                   163/163   1.00               0.31
                        163/167   1.35   0.36-5.02
                        167/167   0.35   0.11-2.87
56-60                   163/163   1.00               0.75
                        163/167   1.76   0.58-5.38
                        167/167   1.04   0.26-4.14
61-70                   163/163   1.00               0.02
                        163/167   0.67   0.36-1.22
                        167/167   0.36   0.16-0.82
Table 21. Relative risks and 95 percent confidence intervals for lung cancer in
different age groups as a reflection of alleles at the XPDe23 site, adjusted for
the high-risk haplotype and the 4 bp deletion.

Age at diagnosis (Yr)   Allele   RR     95% CI
50-55                   AA       1.00
                        AC       3.13   0.95-10.33
                        CC       7.86   1.78-34.64
56-60                   AA       1.00
                        AC       1.33   0.60-2.95
                        CC       1.95   0.63-6.06
61-70                   AA       1.00
                        AC       1.81   1.07-3.07
                        CC       2.54   1.16-5.58
Example 13

The data of Example 9 concerning the high-risk haplotype were stratified according
to age and gender and adjusted for smoking. The results are shown in Table 22. It is
obvious that most of the effect of the high-risk haplotype on risk of lung cancer is
exerted on the young women, while the effect on men is at best very moderate.

Table 22. Sex and age group specific estimates of the lung cancer rate ratios (RR)
in association with the high-risk haplotype, adjusted for duration of smoking.

Age     Homozygous          Female                      Male
group   for haplotype (a)   RR (95% CI)         p       RR (95% CI)        p
50-55   No                  1.0                         1.0
        Yes                 7.02 (1.88-26.18)   0.004   0.60 (0.20-3.18)   0.75
56-60   No                  1.0                         1.0
        Yes                 1.03 (0.29-3.71)    0.97    0.69 (0.30-1.58)   0.37
61-70   No                  1.0                         1.0
        Yes                 0.89 (0.40-0.76)    0.76    1.03 (0.48-2.22)   0.94

a) Homozygous carriers of the high-risk haplotype are defined by homozygosity for
the high-risk alleles at ERCC1 exon 4, ASE-1 exon 1 and RAI intron 1.
Claims:

1. A method for estimating the cancer risk of an individual comprising

- providing a sample from said individual,

- assessing in the genetic material in said sample a sequence polymorphism

- in a region corresponding to SEQ ID NO: 1, or a part thereof, or

- in a region complementary to SEQ ID NO: 1, or a part thereof, or

- in a transcription product from a sequence in a region corresponding to SEQ
ID NO: 1, or a part thereof, or

- in a translation product from a sequence in a region corresponding to SEQ ID
NO: 1, or a part thereof,

- obtaining a sequence polymorphism response,

- estimating the cancer risk of said individual based on the sequence
polymorphism response.

2. The method according to claim 1, wherein the cell sample is a blood sample, a
tissue sample, a sample of secretion, semen, ovum, a washing of a body surface,
such as a buccal swab, a clipping of a body surface, including hairs and nails.

3. The method according to any of the preceding claims, wherein the cell is
selected from white blood cells and tumor tissue.

4. The method according to any of the preceding claims, wherein the sequence
polymorphism comprises at least one mutation base change.

5. The method according to any of the preceding claims, wherein the sequence
polymorphism comprises at least two base changes.
6. The method according to any of the preceding claims, wherein the sequence
polymorphism comprises at least one single nucleotide polymorphism.

7. The method according to any of the preceding claims, wherein the sequence
polymorphism comprises at least two single nucleotide polymorphisms.

8. The method according to any of the preceding claims, wherein the sequence
polymorphism comprises at least one tandem repeat polymorphism.

9. The method according to any of the preceding claims, wherein the sequence
polymorphism comprises at least two tandem repeat polymorphisms.

10. The method according to any of the preceding claims, wherein the cancer is
selected from skin carcinoma including malignant melanoma, breast cancer, lung
cancer, colon cancer and other cancers in the gastro-intestinal tract, prostate
cancer, lymphoma, leukemia, pancreas cancer, head and neck cancer, ovary
cancer and other gynecological cancers.

11. The method according to any of the preceding claims, wherein the cancer is
selected from skin cancer, lung cancer, colon cancer and breast cancer.

12. The method according to any of the preceding claims, wherein the cancer is
selected from skin cancer and breast cancer.

13. The method according to any of the preceding claims 10-12, wherein the skin
cancer is basal cell carcinoma.

14. The method according to any of the preceding claims, wherein the assessment
is conducted by means of at least one nucleic acid primer or probe, such as a
primer or probe of DNA, RNA or a nucleic acid analogue such as peptide nucleic
acid (PNA) or locked nucleic acid (LNA).

15. The method according to claim 14, wherein the nucleotide primer or probe is
capable of hybridising to a subsequence of the region corresponding to SEQ ID
NO: 1, or a part thereof, or a region complementary to SEQ ID NO: 1.
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16. The method according to claim 14, wherein the primer or probe has a length of 
at least 9 nucleotide or peptide monomers. 

17. The method according to any of the preceding claims 14-1 S, wherein at least 
one primer or probe Is capable of hybridising to a subsequence selected from 
the group of subsequences 

1 . GCTCTGAAAC TTACTAGCCC(A/G)GTATTTATGG AGAGGCATTT 

2. GTGGTCAAAT TCTCATTCAT CGTGG (T/C) CCAGGCAAGC 
ACACTTCCTC 

3. ACCCTGAGGT GAGCACCTGT TCCTT(G/T) TCCTTGCCCT TAGCCCA- 
GAG GTAGA 

4. GGGGAGGGGT TTGTGCCTCC AATGA (G/A) CACAAGCTCC 
CCCTGCCCCC CAACT 

5. CCTGGCGGTG GCCGTCACCA GCTTT (T/C) GGGGGTGTTT 
GGGAAGCTGG 

6. CTCCAGCCCC ACTGTTCCCT (A/G) GGCCCTATTG GTCCCCCTGG 

7. ACAAGGAGGA GGCAGAAGTG AGGTT (G/C) AAACCCACTG CCCAATC- 
TTA 

8. CCAACACGGT GAAACCCCGT CTGTA(T/C)TAAAAATACA AAAATTAGCC 

9. AATCCAGGAC CCCATAATCT TCCGT (d/T) ATCTAAAACA ATA- 
ATGGTGA 

10. CCCAAGGGGG CGAGGGGAGG GTGAA (A/G)GGGTGGGACG 
GGGGCAGCCG 

11. GAAGTGAGAA GGGGGCTGGG GGTCG (G/-) CGCTCGCTAG 
CGGGCGCGGG 

12. CGCACGCGCA GTATCCCGAT TGGCT (C/G) TGCCCTAGCG GATT-
GACGGG

13. AACTCCTGGG TTCGATCAAT ACTCA (GACA/-) ATCTTGGCAG 
GCGCAGGAGG 

14. GCTGGGATTA CAGGCTTGAG CCACC (A/G) CGCCCGGCCT
GCAAAGCCAT 

15. TTTTGTATCT TTAGTAGAGA CAGG (T/G) TTTCTCCATG TTGGTCAGGC

16. GCCTCAGCCT CCCGAGTAGC TGAGACT (C/A) CAGGTGCCCG CCAC-
CACGCC

17. TGAAATTGTA GGTTGAGAGG CCAGGCG (C/T) GGTGCTCACG 
CCTGTAATTT 

18. GTTTATAAAC ATTAAACCAG (T/A) GCTGTGTGAA GGCACTTAAT

19. CCGTCTCTAT TAAAAATATA AAA (A/C) AATTTAGCCG GGTGTAGCGG

20. GGGAGGCTCG AGGCGGGC (A/G) GATTGCATGA GCTCAGGATT 

21. TCCCAAGTTT CAGGGCCCAA (T/G) ATTCTCAAAT CACAGGATTC 

22. TGCAGTGAGC TGAGATCGC (A/G) CCACTGCACT CCAGCCTGGG

23. TCTTAGGACG CATGGGGGT (T/G) GAGAGAACGG GGAGATAGAC

24. CTGGGTTCTA GAACTACC (C/T) ATGCAAACCC AGCTGTTTCC 

25. ATTCTGCCCT GGGTTCTAGA ACTACCT (C/A) TGCAAACCOA 
GCTGTTTCCC 

26. GCTGTTTCCC ACCCCATAAG GCA (A/G) TAGGGGAGCC
CACCTCCGCC

27. GACCTAGAAG ATCGGTCGAG A (C/T) AGCAGCTTGA GGCTGGCAGG 

28. CTGGCCAGGA ATGCAGTCGG GTCAC (C/T) CTGTCTAGCC 
ACCGTCTCGC 

29. GGGAGGAGTC GCCGATCAGG (C/T) CCCTTCCTGA AAGTCATCGA 

30. GCAGCCCGGG CTACAGGGTT (A/G) CCTGAGGTGT GGGTCCCAGG 

31. TAGAAATACT AACAAAGGGC (T/C) GTGGGTTTCT CCCCCTGCTT

32. ACAGGAGAGG GAAGGTTTTTTG (A/T) TTTTTTTTTT GTTTTTTTTT

33. GAAGAGGAAG AAGCCCAAAG GGA (A/C) AGAAACCTTC GAGCCA- 
GAAG 

34. GCGCCTCAAC AGCCAGAAGG AGCG (A/G) AGCCTCAGGC CCAGG-
CAGCT

35. TTGAGACTCT CTGTTTGAT (A/G) CTTCACTCAG AAGGTGCTTC 

36. AGGCCAGGCT CCTGCTGGCT G (C/G) GCTGGTGCAG TCTCTGGGGA 

37. CCCCTATACC CTCAAGCAT (C/T) TATCCATTGA GTTACAAACA

38. ACCATCCCCC GCCTTCCGTT (A/C) GTCCGGCCCC CGAGGCTAGC

or to a sequence complementary to any of the subsequences.

18. The method according to claim 17, wherein at least one nucleotide probe is se-
lected from the group consisting of

1. TGAAATTGTA GGTTGAGAGG CCAGGCG (C/T) GGTGCTCACG
CCTGTAATTT

2. GTTTATAAAC ATTAAACCAG (T/A) GCTGTGTGAA GGCACTTAAT 

3. CCGTCTCTAT TAAAAATATA AAA (A/C) AATTTAGCCG GGTGTAGCGG 

4. GGGAGGCTCG AGGCGGGC (A/G) GATTGCATGA GCTCAGGATT 

5. TCCCAAGTTT CAGGGCCCAA (T/G) ATTCTCAAAT CACAGGATTC

6. TGCAGTGAGC TGAGATCGC (A/G) CCACTGCACT CCAGCCTGGG

7. TCTTAGGACG CATGGGGGT (T/G) GAGAGAACGG GGAGATAGAC 

8. CTGGGTTCTA GAACTACC (C/T) ATGCAAACCC AGCTGTTTCC

9. ATTCTGCCCT GGGTTCTAGA ACTACCT (C/A) TGCAAACCGA 
GCTGTTTCCC 

10. GCTGTTTCCC ACCCCATAAG GCA (A/G) TAGGGGAGCC 
CACCTCCGCC 

11. GACCTAGAAG ATCGGTCGAG A (C/T) AGCAGCTTGA GGCTGGCAGG 

12. CTGGCCAGGA ATGCAGTCGG GTCAC (C/T) CTGTCTAGGC 
ACCGTCTCGC 

13. GGGAGGAGTC GCCGATCAGG (C/T) CCCTTCCTGA AAGTCATCGA 

14. GCAGCCCGGG CTACAGGGTT (A/G) CCTGAGGTGT GGGTCCCAGG 

15. TAGAAATACT AACAAAGGGC (T/C) GTGGGTTTCT CCCCCTGCTT 

16. ACAGGAGAGG GAAGGTTTTTTG (A/T) TTTTTTTTTT GTTTTTTTTT

17. GAAGAGGAAG AAGCCCAAAG GGA (A/C) AGAAACCTTC GAGCCA- 
GAAG 

18. GCGCCTCAAC AGCCAGAAGG AGCG (A/G) AGCCTCAGGC CCAGG-
CAGCT

or to a sequence complementary to any of the subsequences.

19. The method according to claim 18, wherein at least one nucleotide probe is se-
lected from the group consisting of

1. GTTTATAAAC ATTAAACCAG (T/A) GCTGTGTGAA GGCACTTAAT

2. CCGTCTCTAT TAAAAATATA AAA (A/C) AATTTAGCCG GGTGTAGCGG

3. GGGAGGCTCG AGGCGGGC (A/G) GATTGCATGA GCTCAGGATT

4. TCCCAAGTTT CAGGGCCCAA (T/G) ATTCTCAAAT CACAGGATTC

5. TGCAGTGAGC TGAGATCGC (A/G) CCACTGCACT CCAGCCTGGG

or to a sequence complementary to any of the subsequences.

20. The method according to any of the preceding claims, wherein at least one se-
quence polymorphism is assessed in a region corresponding to SEQ ID NO: 1
position 1521-37752 (r).

21. The method according to any of the preceding claims, wherein at least one se-
quence polymorphism is assessed in a region corresponding to SEQ ID NO: 1
position 7760-22885 (RAI).

22. The method according to any of the preceding claims, wherein at least one se-
quence polymorphism is assessed in a region corresponding to SEQ ID NO: 1
position 34391-37752.

23. The method according to any of the preceding claims, wherein at least two diffe-
rent probes are used, one probe being selected from the probes as defined in
any of claims 17-21, and the other probe being capable of hybridising to a se-
quence different from SEQ ID NO: 1, or a part thereof, or to a sequence com-
plementary to a region different from SEQ ID NO: 1, or a part thereof.

24. The method according to claim 1, wherein the translational product from a se-
quence in a region corresponding to SEQ ID NO: 1, or a part thereof, is an anti-
body, such as a monoclonal or polyclonal antibody.

25. A method for estimating the cancer prognosis of an individual comprising

- providing a sample from said individual,

- assessing in the genetic material in said sample a sequence polymorphism

- in a region corresponding to SEQ ID NO: 1, or a part thereof, or

- in a region complementary to SEQ ID NO: 1, or a part thereof, or

- in a transcription product from a sequence in a region corresponding to
SEQ ID NO: 1, or a part thereof, or

- in a translation product from a sequence in a region corresponding to SEQ
ID NO: 1, or a part thereof,

- obtaining a sequence polymorphism response,

- estimating the cancer prognosis of said individual based on the sequence
polymorphism response.

26. The method according to claim 25, wherein the method has any of the features
as defined in any of the claims 2-24. 

27. A method for estimating a treatment response of an individual suffering from 
cancer to a cancer treatment, comprising 

- providing a sample from said individual, 

- assessing in the genetic material in said sample a sequence polymorphism 

- in a region corresponding to SEQ ID NO: 1, or a part thereof, or

- in a region complementary to SEQ ID NO: 1, or a part thereof, or

- in a transcription product from a sequence in a region corresponding to
SEQ ID NO: 1, or a part thereof, or

- in a translation product from a sequence in a region corresponding to SEQ
ID NO: 1, or a part thereof,

- obtaining a sequence polymorphism response, 

- estimating the individual's response to the cancer treatment based on the
sequence polymorphism response.

28. The method according to claim 27, wherein the method has any of the features
as defined in any of the claims 2-24. 

29. A primer or probe for use in a method as defined in any of the claims above,
said primer or probe being selected from

TGGCTAACACGGTGAAACC (SEQ ID NO:7)



GGAATCCAAAGATTCTATGATGG (SEQ ID NO:8)
GGGAGGCGGAGCTTGCAGTGA (SEQ ID NO:9)
CTGAGATCGCACCACTGCAC (SEQ ID NO:10)
GGTTTTCTGCTCTGCACACG (SEQ ID NO:11)
CCTTTCTCCTTCCACCAACG (SEQ ID NO:12)
CGGGCTACAGGGTTACCTGAG (SEQ ID NO:13)
TCTGCAACCTGGTGCGAGCAGC (SEQ ID NO:14)
CCTACCACCATCATCACATCC (SEQ ID NO:15)
GCCTTGCCAAAAATCATAACC (SEQ ID NO:16)
CCTCTCCCCAATTAAGTGCCTTCACACAGC (SEQ ID NO:17)
AGCCAGGGAGGTTGAGGCT (SEQ ID NO:18)
AGACAGCCCTGAATCAGCAC (SEQ ID NO:19)
GCAATGAGCCGAGATAGAA (SEQ ID NO:20)
TGGCTAGCCCATTACTCTA (SEQ ID NO:21)

30. A primer or probe for use in a method as defined in any of the claims above,
the other probe

GCCCCGTCCCAGGTA (SEQ ID NO:21)
AGCCCCAAGACCCTTTCACT (SEQ ID NO:22)
GTCCCATAGATAGGAGTGAAAG (SEQ ID NO:23)
CCCTAGGACACAGGAGCACA (SEQ ID NO:24)
TTGTGCTTTCTCTGTGTCCA (SEQ ID NO:25)
TATCAGAAAAGGCTGGAGGA (SEQ ID NO:26)
GAGTGGCTGGGGAGTAGGA (SEQ ID NO:27)
GCCAAGCAGAAGAGACAAA (SEQ ID NO:28)
CCTCAGATGTCCTCTGCTCA (SEQ ID NO:29)
GCCACAGCCCCAGCAAGTAG (SEQ ID NO:30)
AGGACCACAGGACACGCAGA (SEQ ID NO:31)
CATAGAACAGTCCAGAACAC (SEQ ID NO:32)
TTAGCTTGGCACGGCTGTCCAAGGA (SEQ ID NO:33)
ACAGAATTCGCCCCGGCCTGGTACAC (SEQ ID NO:34)
TTGAAACTGGAACTCTGAGAAGG (SEQ ID NO:35)
TGGTGGATGGTGTGAAGCA (SEQ ID NO:36)
CCTTTCTCCAACTTCTTCTCCATTTCCACC (SEQ ID NO:37)






QOQQATCATGTCGTCAATGGACT (SEQ ID NO:38) 
ATGCCCTGTAGGTTCAATGG (SEQ ID NO:39)
TGGAGGTCTTTAGGGGCTTG (SEQ ID NO:40)
GGCTGGTCCCCGTCTTCTCCTTCC (SEQ ID NO:41)
TCTCTGTTGCCACTTCAGCCTC (SEQ ID NO:42)
GTCCTGCCCTCAGCAAAGAGAA (SEQ ID NO:43)
TTCTCCTGCGATTAAAGGCTGT (SEQ ID NO:44)
ATCCTGTCCCTACTGGCCATTC (SEQ ID NO:45)
TGTGGACGTGACAGTGAGAAAT (SEQ ID NO:46)
TGGAGTGCTATGGCACGATCTCT (SEQ ID NO:47)
CCATGGGCATCAAATTCCTGGGA (SEQ ID NO:48)
CACACCTGGCTCATTTTTGTAT (SEQ ID NO:49)
TCATCCAGGTTGTAGATGCCA (SEQ ID NO:50)
AGGCTCAACAAGGAAAAATGC (SEQ ID NO:51)
GCTAGACAGTCAAGGAGGGACG (SEQ ID NO:52)
AAAGGGTGGGTGTGGGAGACATTGG (SEQ ID NO:53)
AAACCAACCTAGGCACCCCAAA (SEQ ID NO:54)
CAGTGTCCAAAGAGCACC (SEQ ID NO:55)
CTACCCCTTTAGCQACC (SEQ ID NO:56)
TCCTGCCCCCAGAGCGTCACC (SEQ ID NO:57)
GTACGGTCCACATAATTTTGGAGGA (SEQ ID NO:58)
CGACGAACTTCTCTGAAGCGAA (SEQ ID NO:59)
AGCGACACGGGCATCTGG (SEQ ID NO:60)
ATGAGCGTCCACCTCCTGAACC (SEQ ID NO:61)
AGGCAGCAGCATCGTCATCCCC (SEQ ID NO:62)
TGCATAGCTAGGTCCTGC (SEQ ID NO:63)
AACTGACRAAACTAGCTCTATGGGGTGGTGCCGCA (SEQ ID NO:64)
CTGGCTCTGAAACTTACTAGCCC (SEQ ID NO:65)
GCTGGACTGTCACCGCATG (SEQ ID NO:66)
GGAGCAGGGTTGGCGTG (SEQ ID NO:67)
TGCCCTCCCAGAGGTAAGGCCT (SEQ ID NO:68)
CCCTCCCGGAGGTAAGGCCTC (SEQ ID NO:69)
GATCAAAGAGAGAGACGAGC (SEQ ID NO:70)
GAAGCCCAGGAAATGC (SEQ ID NO:71)
GGACGCCCACCTGGCCAACC (SEQ ID NO:72)
CGTGCTGCCCAACQAAGTG (SEQ ID NO:73)

31. The primer or probe according to any of claims 28 or 29, wherein the probe is
operably linked to at least one label, such as operably linked to two different la-
bels.

32. The probe according to claim 30, wherein the label is selected from TEX, TET,
TAM, ROX, R6G, ORG, HEX, FLU, FAM, DABSYL, Cy7, Cy5, Cy3, BOFL, BOF,
BO-X, BO-TRX, BO-TMR, JOE, 6JOE, VIC, 6FAM, LCRed640, LCRed705,
TAMRA, Biotin, Digoxigenin, DuO-family, Daq-family.

33. The primer or probe according to any of claims 28-31, wherein the primer or
probe is operably linked to a surface. 

34. The primer or probe according to claim 32, wherein the surface is the surface of
microbeads or a DNA chip. 

35. An antibody directed to an epitope of a RAI gene product. 

36. A kit for use in a method as defined in any of the claims above, comprising at
least one primer or probe, said probe being as defined in any of claims 29-35,
and optionally further amplifying means for nucleic acid amplification.

Subregion of chromosome 19q

Odds ratios and p-values for individual sequence variations in relation to risk of basal cell
Approximate chromosome position of the sequence variations (Mbases)

p-values for association of different sequence variations with risk of basal cell
carcinoma among psoriatic Danes
Approximate chromosome positions of the sequence variations (Mbases)

Fig. 4



XPD-translation start *- -* Start sequence S1

CCCTTCTCACTTCAT GGCGCCGGCCGGACTGTGCAGCOGGGTCGACCCGCCTCCCTCA 
. * XPD-TATA BOX 

TGAATATTCAGCGAGAGGCCGGGTCGTGGACATCCTCGAGGGCTCGCTCCACCWr/ir 
*- -* Start sequence S2 

TA cgagaccattggctaacctgcccgtcaatccgctagggcagagccaatcgggatac 



tgcgcgtgcgcacggaaaagcgagggcggctgactctcgggtgaggcggtgcgggag 

-* Start sequence S3 

GCGTCACTGAGGATCGTCGAGGGCC AATCAAAAAGAAAACATGGAAGGGAAAGAGCC 



GAGAGACTCGATCTCATTCACTAGAATTTO 

End of S sequences*--* Start of region r 
GTATTGATCGAACCCAGGAGTTCGAGATCAGCTTGAGCAAGATAGCG 



4bp deletion 



A 



S1 is the sequence from * to A
S2 is the sequence from m to A
S3 is the sequence from • to A